DEGRADED ZFS pool, hard disk failure
I own a pre-owned HP Z800 workstation. As purchased it contained a hard drive – a Western Digital WD5000AAKX-75U6AA0, SATA III, 500 GB, 7200 RPM, with 16 MB of cache. Unfortunately it passes the overall SMART health check and does not expose a wear-out metric. However, looking at the detailed attributes we get:
Raw_Read_Error_Rate has a positive raw value of 11, while the threshold is set to 51. With 11,685 hours of runtime it is understandable that the drive might break, and it actually did. There is one Current_Pending_Sector, which means a sector is waiting to be remapped or reallocated. But will that happen anytime soon?
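For reference, this is roughly how those SMART details can be pulled up with smartctl; the device name /dev/sdb below is just a placeholder for the failing WD drive:

```
# Overall health check: a failing drive can still report PASSED here
smartctl -H /dev/sdb

# Vendor attribute table: look at Raw_Read_Error_Rate (ID 1),
# Power_On_Hours (ID 9) and Current_Pending_Sector (ID 197)
smartctl -A /dev/sdb

# Full report, including the SMART error log and self-test results
smartctl -a /dev/sdb
```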
I’m unable to clone, migrate, or replicate the VM to another server. ZFS reports that the pool is in a degraded state, which you can see with the zpool status -v command. It says the errors are unrecoverable, and most probably they are. I’ve tried zpool scrub river to no avail. The problem is with the VM-104 disk. Still, the aforementioned VM is accessible via console and works just fine.
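Both commands are sketched below; with -v, zpool status also lists the datasets or files hit by permanent errors, which is how the VM-104 disk shows up:

```
# Show pool health; -v lists datasets/files with permanent errors
zpool status -v river

# Re-read and checksum every block in the pool; progress is
# reported by zpool status while the scrub runs
zpool scrub river

# Once the underlying problem is dealt with, the error
# counters can be reset
zpool clear river
```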
This VM is a Redash installation, and it is the last VM left on that drive, waiting for better times. As this is a home lab setup, it makes use of whatever devices I have available, and not all of them are fully functional, as you can see. You should always have backups, replicated VMs, and redundantly configured RAID.
I was unable to migrate the VM.
I was unable to replicate the VM.
I was unable to back up the VM.
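For completeness, these are roughly the Proxmox CLI equivalents of the three operations; the target node name and backup storage name below are placeholders, not my actual setup:

```
# Migrate VM 104 to another cluster node (node name assumed)
qm migrate 104 pve2

# Check the state of configured storage replication jobs
pvesr status

# Back up VM 104 to a storage called "backups" (name assumed)
vzdump 104 --storage backups --mode snapshot
```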
I ended up identifying four or more bad blocks, but because this is ZFS there are few tools for filesystem checks; scrub is essentially the fsck of ZFS. Should a ZFS pool, in theory, be able to recover from such a failure? Only with a mirrored or otherwise redundant setup, which is not the case here. I was thinking about overwriting these blocks, but left everything as it was. The drive is not decommissioned yet.
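Had I decided to go further, identifying and force-remapping the sectors would look roughly like this; the LBA is a hypothetical value you would get from the scan, and the write step is destructive, so it only makes sense on a drive that is being retired anyway:

```
# Read-only scan that prints the number of every unreadable block
badblocks -sv /dev/sdb

# Try to read one suspect sector directly (LBA is hypothetical)
hdparm --read-sector 123456 /dev/sdb

# Overwriting a pending sector forces the drive to remap it;
# this destroys the data held in that sector
hdparm --write-sector 123456 --yes-i-know-what-i-am-doing /dev/sdb
```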