Bad idea to rely on a new hard drive

This won’t be one of my usual horror stories. Instead, it is a story of the horrors that can happen to your data unless you pay attention.

Studies on hard drive reliability are extremely rare. Most of the information comes from manufacturers, and that data is synthetic, based on artificial tests and return rates. To fill the void, Google published an exceptionally valuable study of hard drive failure rates based on its server farms.

hard_drive_served

Photo by F.S.M.

Early failures

It is a known fact that hard drives are likely to fail in their first year of operation. The study narrows it down further to the first three months, a critical period with a high rate of failures.

The usual drive replacement scenario is transferring all information from the old hard drive to a freshly purchased one, and that may be a recipe for disaster.

Instead of enjoying the new pool of gigabytes right away, it is better to:

  • put the new drive through an initial stress and surface test (I will cover tools for this in future posts);
  • use the new drive as a secondary one through the first months;
  • keep the old drive as a backup medium (unless it is being replaced because of failure symptoms).
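
As a rough illustration of what an initial write-and-verify pass looks like, here is a minimal Python sketch. It is not a substitute for a dedicated surface scan tool: it only exercises a test file on the drive's filesystem, and the operating system may serve the read-back from cache rather than from the platters. The file name `burnin.bin`, the block size and the function names are all assumptions made for this example.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice for this sketch)


def write_blocks(path, block_count, block_size=BLOCK_SIZE):
    """Write random blocks to path and return their SHA-256 digests."""
    digests = []
    with open(path, "wb") as handle:
        for _ in range(block_count):
            block = os.urandom(block_size)
            digests.append(hashlib.sha256(block).hexdigest())
            handle.write(block)
        handle.flush()
        os.fsync(handle.fileno())  # ask the OS to push the data to the device
    return digests


def verify_blocks(path, digests, block_size=BLOCK_SIZE):
    """Read path back and return the indexes of blocks that do not match."""
    bad = []
    with open(path, "rb") as handle:
        for index, expected in enumerate(digests):
            block = handle.read(block_size)
            if hashlib.sha256(block).hexdigest() != expected:
                bad.append(index)
    return bad


if __name__ == "__main__":
    # Point target_dir at a folder on the new drive; a temp dir is used here.
    with tempfile.TemporaryDirectory() as target_dir:
        test_file = os.path.join(target_dir, "burnin.bin")
        digests = write_blocks(test_file, block_count=8)
        print("mismatched blocks:", verify_blocks(test_file, digests))
```

To put real load on a drive, the block count would have to be raised until the file covers a large share of the free space, and the pass repeated several times.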

Usage myths

High temperature and workload are usually considered negative factors for drive lifetime. Counter-intuitively, the study shows that high temperature has little correlation with failure rates (except for old drives), and that drives operating at (suspiciously) low temperatures below 35°C are more likely to fail.

The same goes for workload: as long as the drive gets through the first critical months, further usage has little effect on failure probability.

So while frying a hard drive in a crappy case is never a good idea, other than that you can use it without worrying about “overworking” it.

Failure indicators (or lack of any)

SMART technology is incorporated in all modern hard drives and serves as a self-diagnostic module. It must be enabled in the BIOS, and its indicators can be read (and optionally diagnosed) with various software such as SpeedFan or PC Wizard.

However, while bad SMART values certainly indicate a problem with the drive, the study concludes that a large percentage of failing drives show no SMART warning signs at all.
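
For those who prefer to read SMART indicators from a script, the sketch below parses an attribute table in the layout printed by `smartctl -A` from smartmontools. That tool is not mentioned above and is an assumption of this example, as are the function names and the list of watched counters. The sample report is made up for illustration and is not output from a real drive.

```python
# Made-up sample in the column layout of `smartctl -A` (not real drive data).
SAMPLE_REPORT = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1200
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
"""

# Raw counters worth a look when non-zero (an assumption of this sketch).
WATCHED_RAW_COUNTERS = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "UDMA_CRC_Error_Count",
)


def parse_smart_attributes(report):
    """Turn the attribute table into a dict keyed by attribute name."""
    attributes = {}
    for line in report.splitlines():
        fields = line.split(None, 9)
        # Skip the header and anything that is not a full attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attributes[fields[1]] = {
            "id": int(fields[0]),
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": fields[9].strip(),
        }
    return attributes


def flag_attributes(attributes, watched=WATCHED_RAW_COUNTERS):
    """Return names of attributes that look suspicious."""
    flagged = []
    for name, data in attributes.items():
        # An attribute counts as failed when its normalized value
        # has dropped to the threshold or below.
        below_threshold = data["thresh"] > 0 and data["value"] <= data["thresh"]
        raw_tokens = data["raw"].split()
        raw_count = int(raw_tokens[0]) if raw_tokens and raw_tokens[0].isdigit() else 0
        if below_threshold or (name in watched and raw_count > 0):
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    print(flag_attributes(parse_smart_attributes(SAMPLE_REPORT)))
```

An empty result only means that nothing is flagged. In line with the study's conclusion, it is no proof that the drive is healthy.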

Missing parameters

One of the most interesting parameters that the study was unable to reliably access and process was drive vibration. There is very little data on how vibration affects drives in the long term. It may be especially critical for drives that are soft-mounted in silent computers to reduce noise.

Manufacturers recommend hard (rigid) mounting of drives, so sacrificing that for reduced noise is basically a gamble without any real data on possible results.

There is also the interesting topic of power on/off stress. Server drives (on which the study is based) run 24/7 minus repairs, but that is rarely the case for home and office computers.

Personal experience

While I don’t have direct experience working with huge quantities of hard drives, I have certainly seen loads of them over the years.

I can confirm the early death syndrome (it makes perfect sense, after all), and I was never a believer in high load theories (I have seen, and made, some drives run for years in unfavorable conditions).

There is one additional factor I want to share that Google couldn’t test. Server hardware is not moving anywhere, but being moved around is clearly not uncommon for home desktops.

Physical factors such as blows, frequent transportation, and moving the computer while the hard drive is active can have a devastating effect on drive reliability. It is a good idea to place the computer in such a way that it is unlikely to be disturbed.

It is also why I have little faith in external hard drives as a backup medium: they are good for transportation, but that very same moving around makes them vulnerable to physical damage.

Overall

So the main points are:

  • do not trust newly purchased drives until they’ve been through testing and some usage;
  • you can’t reliably predict hard drive failure;
  • always have a backup strategy implemented, automated and periodically checked (I use Cobian Backup, SyncExp and Dropbox for mine).
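
The "periodically checked" part is the easiest one to skip, so here is a minimal sketch of such a check in Python. It assumes the backup is a plain mirrored folder, which may not match how any particular backup tool stores its data, and the function names are made up for this example. It compares SHA-256 digests of every file in the source folder against the copy in the backup folder.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_backup(source_dir, backup_dir):
    """Return (missing, different) lists of paths relative to source_dir."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    missing, different = [], []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            missing.append(relative.as_posix())
        elif file_digest(source_file) != file_digest(backup_file):
            different.append(relative.as_posix())
    return missing, different
```

Calling `compare_backup` with the source folder and the backup folder returns two lists: files that are absent from the backup, and files whose content differs. Both lists being empty means the mirror matched at the time of the check.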

Study page research.google.com/pubs/pub32774.html

PDF download research.google.com/archive/disk_failures.pdf

How reliable do you think hard drives are? Do you trust them as a storage medium, or do you think that they are likely to fail?

9 Comments

  • Angelo R. #

    Hey at least this is a start right? Because of studies like this people are bound to start testing on their own and posting their findings on the net. I for one, would love to do some stress tests and such on hard-drives. My parents computer of 8 years, on which I first began my forays into development just recently died. They leave it on 24/7 at most putting it to sleep occasionally. Since I'll be the one replacing all the parts, I thought it would be a good idea to go replace their old hard-drive with a nice fresh SATA II drive.. perhaps I'll pick up the drive now and start testing..
  • Rarst #

    @Angelo Actually it's better people shut up about their in-home results. I quite hate when someone takes one piece of hardware and extrapolates "findings" to everything manufacturer ever produced. I had once seen person hating all of Samsung products because he had once bought printer with factory defect (which was naturally warranty-replaced). I use some HDD test tools worth posting about (MHDD, Victoria), but they are not high priority... Frankly HDD (and RAM) testing is one of most boring things ever. Takes hours (days if being thorough) and rarely finds a thing. :) They can also be quite harmful if not used properly. It is relatively easy to nuke or lock the hard drive with good low-level tool.
  • Talk Binary #

    I recently bought a hard drive and the one problem I had with it, is that it kept disappearing from My Computer. Then when it would reappear, I had to "reformat" it. Scary, since I had data on it! Found out I had to rewrite the MBR of the disk. Then, I realized I had to use a better SATA cable. Problem fixed! (: I'll probably write a post on my horror story so others can avoid freaking out like I did initially.
  • Rarst #

    @Diego Luckily I didn't encounter any problematic SATA cables so far. But IDE cables always sucked. At least problems with cable often show up on SMART (forgot exact indicator) so relatively easy to catch.
  • Talk Binary #

    Yes, it took me awhile to figure out it was a SATA cable. It is not like I had them handy. I had to ask a friend. Word of advice, DON'T buy cheap cables! Also, don't expect your data to be magically "formatted". It might still be there. Simply your MBR might be corrupted.
  • Rarst #

    @Diego I have tons of cables around. Never bought one separately, they just pile up over time. Doing some cleaning on vacation and had to sort through them and filter some I-have-no-clue-what-this-is-for cables out. :) Hard drives cables are easy to amass around where computers are assembled or upgraded. Motherboards come with bunch of cables while only two (HDD+DVD) are actually used most of the time.
  • Nathaniel #

    What would you recommend for thoroughly stress-testing a new HDD? I'm thinking of picking up a couple of Samsung F1s because they're dirt cheap, but there seem to have been quite a number of people complaining about bad sector issues very early on.
  • Rarst #

    @Nathaniel Heh, I want to do some posts on HDD testing all the time but hesitant. Line between what can be safely used and dangerous non-user grade tools is very blurred here. If you are looking for maximum functions (and complexity that comes with that) I'd say MHDD is still best (for free at least) http://www.softpedia.com/get/System/Hard-Disk-Utils/MHDD.shtml If something simpler - I quite like HDDScan http://hddscan.com/ Interface in latest version is questionable but it is good for surface scans under Windows.