INTERACT FORUM

Author Topic: Drive reliability  (Read 3494 times)

tunetyme

  • Galactic Citizen
  • ****
  • Posts: 410
  • Have tunes will travel
Drive reliability
« on: May 30, 2007, 12:44:20 pm »

Thanks for all your help! 

It appears that MC has been doing auto backups daily since I installed it 6 weeks ago.  This is the second Maxtor 250 GB SATA drive that has died on me.  They seem to last about 13 or 14 months.  For the last 15 years I have had great experience with Maxtor.  Time for a change.

An observation regarding these large drives.  It seems that when you really fill them up, to 90% or more of the drive's capacity, problems start occurring.  I am seeing files being corrupted and various other difficulties.  In future I do not plan on using any more than 75% of the drive's capacity.
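
For anyone who wants to keep an eye on how full a drive is, here is a minimal Python sketch. It only reports usage against a limit; it says nothing about whether a full drive is actually less reliable. The function names and the 0.75 default (taken from the 75% figure above) are illustrative assumptions, not part of MC or any drive tool.

```python
import shutil


def usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def over_threshold(path, limit=0.75):
    """True if the volume holding `path` is fuller than `limit` (hypothetical helper)."""
    return usage_fraction(path) > limit


if __name__ == "__main__":
    # "." checks the current volume; on Windows a music drive might be "E:\\".
    target = "."
    print(f"{target}: {usage_fraction(target):.1%} used, "
          f"over 75% limit: {over_threshold(target)}")
```

Running it before a big ripping session gives a quick yes/no on whether the drive is past the chosen limit.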

I am curious what others have experienced with these high capacity drives.  What is the best brand?

Tunetyme
Logged

JimH

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 72444
  • Where did I put my teeth?
Drive reliability
« Reply #1 on: May 30, 2007, 12:50:28 pm »

Quote
An observation regarding these large drives.  It seems that when you really fill them up, to 90% or more of the drive's capacity, problems start occurring.  I am seeing files being corrupted and various other difficulties.  In future I do not plan on using any more than 75% of the drive's capacity.

I am curious what others have experienced with these high capacity drives.  What is the best brand?

Google did a nice report on drive failures:

http://labs.google.com/papers/disk_failures.pdf
Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Database location
« Reply #2 on: May 30, 2007, 01:07:20 pm »

And if you read that report, they determined that there is no best brand.  They're all basically the same.  They also determined that there is little or no difference between consumer "desktop" drives and more-expensive "enterprise" drives as far as reliability.

I personally like Seagate and Western Digital.  Only because of the warranties though -- when a drive does fail, it's good to have a company you can count on for the warranty.  I've returned dead drives with both companies and both were very easy to deal with.  Maxtor was another story entirely (though they've been subsequently bought out by Seagate so they're probably fine now too).

Most of my Seagate and WD drives have either died in the first 60 days or lasted a good 3-5 years.  I've heard the "keep empty space" thing before and seen plenty of evidence that it is a myth.  You're probably just hitting bad luck.  I have plenty of drives that I use regularly that are almost 100% full (I offload old video files onto drives and then store them in a fire safe) without issue.  It could be that some software misconfiguration is causing the disk format to become corrupt... Two questions:

1. Are you SURE the drives are physically failing?  Usually you can do a hardware test using tools downloaded from the manufacturer.  If it is having corruption issues, but the drive tests out okay, it could be: A) software, B) bad cables (those ribbon cables break), or C) bad controller hardware on the motherboard.  You can usually also do a low-level format using those tools that will put the drive back to "really" blank, which will often correct non-physical formatting problems.

2. Are you formatting to NTFS or FAT32?  If you're using FAT32 then stop doing that.  It's a horrible and unreliable disk format.  Every time an application locks up, if you're running on a FAT partition you have the potential that the disk format can become corrupted.  NTFS is a vastly superior disk formatting technology and you should use it whenever possible.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

tunetyme

  • Galactic Citizen
  • ****
  • Posts: 410
  • Have tunes will travel
Re: Drive reliability
« Reply #3 on: May 30, 2007, 02:36:53 pm »

Thanks for your comments.  I thought the Google report was very helpful. 

I was working with the drive when it went south on me.  I bought 25 or so new CDs that I was planning on ripping and storing on the drive.  I was down to 8% available space so I copied all of my folders that began with "a" to my c: drive.  I had a problem with one song and I ripped that CD again.  It then copied to the c: drive fine.  I wanted to look at the e: or music drive to see what kind of space is available and determine if I needed to defrag the drive before I started ripping more music.  When I ran defrag and selected analyse it hung up. 

Some tests indicate that the drive is not formatted.  Chkdsk (run without the /F flag) indicates there are a lot of bad sectors.  The manufacturer's test software says the drive is defective.  My guess is that there are some corrupt sectors in the file allocation table (the drive is formatted NTFS).

Any ideas or tools that I can use to repair the formatting?

Tunetyme   :(
Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Drive reliability
« Reply #4 on: May 30, 2007, 09:42:39 pm »

Quote
The manufacturer's test software says the drive is defective.

Then it probably is.  You can try to do a low-level format on the drive and test it again to see if it passes.  Also try replacing the data cable connecting the drive.  A bad cable can lead to data corruption and failed tests.  A low-level format WILL destroy any data remaining on the drive, however.  If you want to try it, do a search on Maxtor's web site for instructions (it's usually done through the same testing software you used to test it).

If you really want to save the data (and are willing to pay) then DriveSavers is your best bet.

However, it's fairly likely that if it failed the manufacturer's test then it's a goner.
Logged

AndyCircuit

  • Regular Member
  • World Citizen
  • ***
  • Posts: 197
Re: Drive reliability
« Reply #5 on: May 31, 2007, 11:30:11 am »

Your problem is most probably heat related. I had some Maxtor issues lately too in my server while the Seagate and Hitachi drives worked fine. Searching for a reason, I found out that the drives were pretty hot and the Maxtors are just more sensitive than the others. I solved my problem with a good backplane like this one    http://www.cov-hamburg.de/html/bestellnummer-318.php3  but those are much too loud for a desktop. You might consider quieter cooling.
BTW, good brand is relative. All brands have had a series of drives that failed very often over the years, and just because e.g. Seagate is fine ATM you can't be sure it will be the same tomorrow.
Logged
Electricians do it 'til it Hz

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Drive reliability
« Reply #6 on: May 31, 2007, 11:37:55 am »

Interestingly, the Google Labs report cast a lot of doubt on the oft-repeated (but rarely backed up with hard data) claim that drive failures are often correlated with high operating temperature.  If you look around, almost all of the "evidence" for this "fact" is anecdotal.  When you look at huge numbers of drives in aggregate (as the Google study did) the results can be surprising...

Quote
We first look at the correlation between average temperature during the observation period and failure. Figure 4 shows the distribution of drives with average temperature in increments of one degree and the corresponding annualized failure rates. The figure shows that failures do not increase when the average temperature increases.  In fact, there is a clear trend showing that lower temperatures are associated with higher failure rates.  Only at very high temperatures is there a slight reversal of this trend.

Figure 5 looks at the average temperatures for different age groups. The distributions are in sync with Figure 4 showing a mostly flat failure rate at mid-range temperatures and a modest increase at the low end of the temperature distribution. What stands out are the 3 and 4-year old drives, where the trend for higher failures with higher temperature is much more constant and also more pronounced.

Overall our experiments can confirm previously reported temperature effects only for the high end of our temperature range and especially for older drives. In the lower and middle temperature ranges, higher temperatures are not associated with higher failure rates. This is a fairly surprising result, which could indicate that datacenter or server designers have more freedom than previously thought when setting operating temperatures for equipment that contains disk drives. We can conclude that at moderate temperature ranges it is likely that there are other effects which affect failure rates much more strongly than temperatures do.

Quote
One of our key findings has been the lack of a consistent pattern of higher failure rates for higher temperature drives or for those drives at higher utilization levels.  Such correlations have been repeatedly highlighted by previous studies, but we are unable to confirm them by observing our population. Although our data do not allow us to conclude that there is no such correlation, it provides strong evidence to suggest that other effects may be more prominent in affecting disk drive reliability in the context of a professionally managed data center deployment.
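
For readers unfamiliar with the "annualized failure rate" mentioned in the quoted passages, here is a minimal sketch assuming the common definition: failures divided by drive-years of operation. The function name and the numbers in the example are made up for illustration, and the report's exact method may differ.

```python
def annualized_failure_rate(failures, drive_days):
    """Failures per drive-year of operation, returned as a fraction.

    `drive_days` is the total number of days all observed drives were in
    service, summed across the whole population (an assumed input format).
    """
    if drive_days <= 0:
        raise ValueError("drive_days must be positive")
    drive_years = drive_days / 365.0
    return failures / drive_years


if __name__ == "__main__":
    # Made-up example: 1000 drives observed for 365 days each, 30 failures.
    afr = annualized_failure_rate(failures=30, drive_days=1000 * 365)
    print(f"AFR: {afr:.1%}")
```

Counting drive-days rather than drives means a drive that was only in service for part of the period contributes proportionally less to the denominator.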

Logged

hit_ny

  • Citizen of the Universe
  • *****
  • Posts: 3310
  • nothing more to say...
Re: Drive reliability
« Reply #7 on: May 31, 2007, 11:52:43 am »

Avg temp of my drives has been 45-50 °C; they should be fine up to 55-60 °C, which is the top of their spec'd operating range. All Seagate.

No probs, I replace them every 3 yrs with bigger ones. I checksum my media and have not seen any probs.

Maxtor has had a bad rep for high temps for the last few yrs.
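
The media checksumming mentioned above could look something like this in Python. It is a minimal sketch, not the poster's actual setup: the choice of SHA-256, the function names, and the idea of keeping digests in a dict are all assumptions made for illustration.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until f.read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(recorded):
    """Given {path: previously recorded digest}, list paths whose content differs now."""
    return [path for path, old in recorded.items() if file_sha256(path) != old]
```

Record the digests once after ripping, then rerun `changed_files` now and then. A file that shows up without having been edited on purpose is a hint of silent corruption.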
Logged

AndyCircuit

  • Regular Member
  • World Citizen
  • ***
  • Posts: 197
Re: Drive reliability
« Reply #8 on: May 31, 2007, 12:45:05 pm »

Quote
Interestingly, the Google Labs report cast a lot of doubt on the oft-repeated (but rarely backed up with hard data) claim that drive failures are often correlated with high operating temperature.  If you look around, almost all of the "evidence" for this "fact" is anecdotal.  When you look at huge numbers of drives in aggregate (as the Google study did) the results can be surprising...

Do I have to assume now that the purpose of air conditioning in server rooms is to prevent nerds from sweating?
Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Drive reliability
« Reply #9 on: May 31, 2007, 01:08:51 pm »

Quote
Do I have to assume now that the purpose of air conditioning in server rooms is to prevent nerds from sweating?

Those old Netburst-based Xeon CPUs still need some serious cooling.   ;)  ;D

Besides, when we nerds sweat it's dangerously similar to (gasp) exercise outdoors, which would threaten our pasty-white complexions, and expose us to (the kryptonite of all geeks worldwide) sunlight.  Just thinking about this is enough to send many geeks into fits of panic and cold-sweats.
Logged

johnnyboy

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 626
Re: Drive reliability
« Reply #10 on: May 31, 2007, 09:20:21 pm »

The air conditioning isn't for the hard drives, it's there for the CPU.
The room doesn't have to be kept freezing; it just needs something that can get rid of the huge amount of heat generated, which otherwise keeps building up, and AC is really the only way to do that. Plus, the people running those server rooms and deciding what needs to be done are geeks who just regurgitate and act on what they've 'heard', and they believe things they hear often enough, true or not.

Without that AC the server room gets insanely hot, the CPUs overheat and then the machines all shut down automatically. That's what the AC is there to stop.
I know, because I experienced it first hand when the AC in our server room packed in.
Logged

AndyCircuit

  • Regular Member
  • World Citizen
  • ***
  • Posts: 197
Re: Drive reliability
« Reply #11 on: June 01, 2007, 03:41:05 am »

Which part in particular of "I've solved the problem" can't you understand? I bought 6 Maxtor SATA 250 GB drives about 15 months ago and lost 3 of them within the first 3 months of operation (2 completely, not appearing in the BIOS anymore, and one with a huge number of bad sectors, starting with error messages rather than a sudden death). A simple fingertip temperature test was enough to tell me why. Putting the remaining 3 and the new ones (replaced under warranty) into the cooling backplanes solved my problem immediately. All my drives, not just the Maxtors, are now running with a surface temp of about 30 °C (24/7) and I haven't seen a single problem since.
Air conditioning is just to cool CPUs? Dream on. I was an electrical technician for decades and am a computer dinosaur, so I know both sides of the wall plug... and for sure a lot of server rooms and their admins. For example: the Canadian consulate in Hamburg is running a humble machine 24/7 (plus switch and the phone box) in a pretty big archive room for their Ottawa connections. Although it is nothing compared to your and my machine(s), Ottawa demanded air conditioning for reliability reasons (I have to agree). I know about the details because they sent me again to connect all the stuff, including AC. Ottawa shipped the complete material with US standard power wiring, sigh. However, the goal was to cool down the complete system, not just the lousy mid class CPU.
Oh, BTW... johnnyboy... never mind, I never expected somebody here to see the irony of my second post (air conditioning = AC = initials of my moniker).
Logged