Hi all, it's been a while!
@nathan if you're around, I'd be interested in your thoughts.
Most of the drives in my storage pool are 5+ years old, and two have started accumulating bad sectors.
I'm using StableBit DrivePool and StableBit Scanner, with the Scanner set to evacuate a drive as soon as it detects an error, and I always keep enough spare capacity in the pool to absorb the contents of one of my largest (8TB) drives.
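To make the spare-capacity rule concrete, here's a rough Python sketch of the arithmetic. The drive names and sizes are made up, it ignores duplication, and it is not how DrivePool itself works out free space. The idea is just that, for any one drive, the free space on all the other drives has to cover the data sitting on it.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str        # hypothetical label, for illustration only
    size_tb: float   # total capacity in TB
    used_tb: float   # data currently stored, in TB

    @property
    def free_tb(self) -> float:
        return self.size_tb - self.used_tb


def can_evacuate(pool: list[Drive], failing: Drive) -> bool:
    """True if the other drives have enough free space for the failing drive's data."""
    spare = sum(d.free_tb for d in pool if d is not failing)
    return spare >= failing.used_tb


def pool_survives_any_single_evacuation(pool: list[Drive]) -> bool:
    """True if every drive in the pool could be emptied onto the rest."""
    return all(can_evacuate(pool, d) for d in pool)


# Made-up example pool: three 8TB drives and one 4TB drive.
pool = [
    Drive("disk1", 8.0, 6.5),
    Drive("disk2", 8.0, 5.0),
    Drive("disk3", 8.0, 7.0),
    Drive("disk4", 4.0, 1.0),
]
print(pool_survives_any_single_evacuation(pool))
```

If this comes back False, it's time to cull some data or add a drive before the next evacuation is needed, not after.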
I've not lost any data (fingers crossed), but I was wondering what others do to maintain the long-term health of their storage pools?
Do you recycle and test drives or just replace them?
Do you just replace all drives after 5 years? That's getting pretty expensive, eh!
So how do you manage your data when all your drives are getting on a bit? (Apart from backups, which I do for my entire music collection and other key data.)
This is what I'm currently doing to try to get the maximum life out of my drives. I realise that by pushing a drive beyond 6-7 years I'm probably asking for problems, but budgets are finite...
Before everyone says I should just replace the drive: yes, that's an option, but when you have many 8TB drives, even replacing them with 16TB or 20TB drives gets expensive very quickly. Drives also often have weak areas that fail, and once those areas are remapped the drive can happily operate for many more years.
My thoughts and process...
1. A clean-up and a cull. Yep, it's hard enough to clean up and de-clutter in the real world, let alone your digital life, but I recently did a digital clean-up and deleted a bunch of stuff: old files, old downloads, programs and multimedia, older generations of media that I'm never likely to use again. Quite liberating actually! Amazing how much digital junk you can accumulate! A few TB in my case...
2. Keep a drive monitoring program running that periodically reads every sector on every hard drive, to force error correction (ECC) and reallocation of failing sectors.
3. When I get a sector reallocation error, the data gets automatically evacuated to other drives in the pool.
4. I then remove that drive from the enclosure and plug it into a separate enclosure for testing on a different computer. I've seen a failing drive cause weird controller behaviour, with excessive timeouts and retries leading to filesystem and drive issues elsewhere, so it's better to isolate the failing drive ASAP.
5. I check the SMART logs, obviously, and if there are hundreds of errors it's probably time to destroy the drive and toss it.
6. If the error count isn't high, I'll run the Victoria HDD tool to write to every sector and force reallocation of any bad ones. The reason I like this free tool is that it gives very granular control, plus detailed logging and tracing of what happens as each sector is written, so you can make a call on what to do with the drive with a little more information than just a SMART log and a full format. Formatting alone doesn't really tell you much about whether the drive is really failing or has other controller or mechanical issues.
7. If the drive doesn't have too many long retries and has no errors, I can look at getting it ready to go back into the pool. (For me, warnings are OK if the retries are under 5 seconds, which allows for a sector remap and controller reset during an attempted write.)
8. So the drive successfully completes the full sector write test? Time to delete the partition and reformat (slow full format). Then I check the SMART logs, and as long as there's no big increase in reallocated sectors, no pending reallocations, and the format completed in a reasonable time (24 hours for these big 8TB drives), it's time to put it back in the pool and under some real-world load.
9. Because I'm using StableBit DrivePool, which has a bunch of cool balancer plugins and can force duplication and file placement, I can push some duplicated files onto the drive now that it's back in the pool.
10. If the drive handles the real-world use for a week or so, I consider it safe to keep in the pool. If the scanner picks up more bad sectors at this stage, it's time to replace it.
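Roughly, the decision logic in steps 5 to 8 boils down to something like the Python sketch below. The field names are made up for illustration and the numbers have to be typed in by hand from the SMART log and the write-test log; nothing here reads a drive. The 100-error, 5-second and 24-hour figures come from the list above, while the cut-off for "no big increase" in reallocated sectors is purely my guess at a sensible value.

```python
from dataclasses import dataclass


@dataclass
class DriveCheck:
    smart_error_count: int    # errors in the SMART log (step 5)
    write_errors: int         # hard errors during the full write test (step 6)
    slowest_retry_s: float    # longest retry seen during the write test (step 7)
    realloc_before: int       # reallocated sectors before the full format
    realloc_after: int        # reallocated sectors after the full format
    pending_sectors: int      # sectors still pending reallocation
    format_hours: float       # duration of the slow full format (step 8)


MAX_SMART_ERRORS = 100      # "100s of errors" means toss it
MAX_RETRY_S = 5.0           # retries under 5 seconds are tolerated
MAX_FORMAT_HOURS = 24.0     # reasonable time for an 8TB drive
MAX_REALLOC_INCREASE = 8    # assumption: what counts as "no big increase"


def triage(c: DriveCheck) -> str:
    """Return 'replace' or 'trial' (back in the pool under real-world load)."""
    if c.smart_error_count >= MAX_SMART_ERRORS:
        return "replace"
    if c.write_errors > 0 or c.slowest_retry_s >= MAX_RETRY_S:
        return "replace"
    if c.pending_sectors > 0:
        return "replace"
    if c.realloc_after - c.realloc_before > MAX_REALLOC_INCREASE:
        return "replace"
    if c.format_hours > MAX_FORMAT_HOURS:
        return "replace"
    return "trial"


def min_throughput_mb_s(size_tb: float, hours: float) -> float:
    """Average MB/s needed to cover the whole drive in the given time (decimal units)."""
    return size_tb * 1e12 / (hours * 3600) / 1e6


# Made-up numbers for a drive that passed everything.
check = DriveCheck(12, 0, 1.8, 16, 16, 0, 21.5)
print(triage(check))
print(round(min_throughput_mb_s(8.0, 24.0), 1))
```

The last line is just a sanity check on the 24-hour figure: covering 8TB in 24 hours needs an average of roughly 93 MB/s, so a drive that takes much longer than that is spending a lot of its time on retries.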
So what are others doing to manage long-term data and drive health?
SMART by Hilton, on Flickr
Victoria-write-test by Hilton, on Flickr
stablebit-fileplacement by Hilton, on Flickr
stablebit by Hilton, on Flickr