INTERACT FORUM


Author Topic: Cost effective way to maintain a healthy array, 20TB or bigger?  (Read 21049 times)

Daydream

  • Citizen of the Universe
  • *****
  • Posts: 771
Cost effective way to maintain a healthy array, 20TB or bigger?
« on: December 10, 2012, 02:08:30 am »

Let's make a long story short: I know how DB operates, you don't have to explain. Which makes it all the more puzzling why the drive went to hell, its partition table obviously damaged. Maybe it was a weird coincidence, maybe it was the planets' alignment :)

Nothing bad to say about DB; it worked untouched for about a year. However, now I find myself thinking about the bigger picture, and I find it... complex and problematic.

What is the most cost effective way to maintain a healthy array, 20TB or bigger?
Money - from a 'common sense' perspective - is an important aspect. At the risk of hijacking the thread a bit, here's what I'm thinking:
- backup is not an option, because I'd then need 40TB, or I'd do nothing else 24/7 but upload to Crashplan. For 6 months.
- HDD prices seem to be stuck at the level of "and we'll never fully recover after the Thailand flood".
- HDD quality and endurance seem to get worse as time goes by and capacities per drive increase. Looking at either prices or reviews for 4TB drives on Newegg is rather scary.
- is this a subliminal way to force us to go to the cloud?
- I will never understand why a high-end RAID controller costs $1000 or more, and because of that and other costs associated with typical RAID arrays, I don't foresee going hardware RAID any time in the future; or ever.
- Drive Bender (and similar) works very well, but it's not enough. _On purpose_, for the last few years I ran setups with drive pooling one way or another, without parity. Granted, when I began I wasn't at 20-30TB but much less. As the array and drives grow bigger the equation changes; it's one thing to re-rip 500GB of Blu-rays and something else to do it for 3TB (or more). It just takes too much time, assuming everything else is at hand.

Because of the above and other things, I'm thinking ZFS or FlexRAID. Any thoughts?
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #1 on: December 10, 2012, 03:00:32 am »

How do you intend to set up ZFS? You'd need a dedicated machine with some Linux distro and go through hell setting it up (unless you're experienced with Unix/Linux OSes), or go with an appliance-like distro which turns a PC into a NAS device, like FreeNAS.

Problem is nothing else will read or support ZFS. Virtually no 3rd party tools support ZFS and the drivers are at beta or even alpha stage. For critical applications, your backups, your music and movies, I would strongly advise NOT to go for ZFS, unless you're an absolute guru on Linux and can troubleshoot it blindfolded.

At the time I chose Drive Bender, I looked into FlexRAID as well. There were massive problems with the drivers and people complained about their support. I'm not sure what the issues were exactly, but it was bad enough for me to not even trial it. Things may be different now, but I'd tread carefully if I were you.

If RAID-5-like parity is what you want and you're not afraid to get your hands dirty, you may look at SnapRaid or disParity. These work by means of a command-line tool which you run on a schedule to create parity blocks on a drive/folder. It takes some time to figure out and it's not the most user-friendly, but it does work if you manage to set it up properly.
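To give a feel for it, a minimal SnapRaid setup is just a small config file plus one command run on a schedule. This is only a sketch - the drive letters (D: and E: for data, F: for parity) are made-up examples, so check the manual for the real details:

Code:
# snapraid.conf - example layout only
parity F:\snapraid.parity
content F:\snapraid.content
content D:\snapraid.content
data d1 D:\
data d2 E:\

Then "snapraid sync" run from Task Scheduler (or a batch file) updates the parity, and after a dead drive is replaced, "snapraid fix" rebuilds its contents from the surviving data plus the parity.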
Logged

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #2 on: December 10, 2012, 03:15:11 am »

It really depends what you need;
- How much space you need
- What your desire for "backup" vs "availability" is

FYI - initially I had two pools of HDDs:
1) On my main PC, which is used to serve all the media; I now have 25+TB there and the bulk of the space is BD rips
2) On my WHS box, which was the backup for all the PCs; I also robocopied my media (minus the BD rips) to another, much smaller 8TB pool

After a bunch of my 2TB WD Greens died at the same time, I upgraded to 4TB Hitachi HDDs and rebuilt my WHS box with the same size pool, as the pain of re-ripping 100's of BDs was a PITA.

I got a bunch of the 4TB HDDs during the Black Friday sales at $200 per disk from B&H.  Cheap.

My WHS box that backs up everything is in a separate location, which should protect me not only from drive failure and inadvertent file deletions and changes, but also from theft, fire etc...  All things RAID will not do.  This all comes at a cost, however.  I've been spreading the costs out by purchasing a couple of drives every few months.

Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #3 on: December 10, 2012, 03:23:12 am »

When I say "cheap" I mean on a per item basis.  A 4TB HDD ($200) will hold about 100 BD (to keep the math easy) which is $2 per BD to Rip to the main pool and another $2 if you want to back it up to a second pool.  When you think how long it takes to re-rip, import tag etc, the extra $2 is money well spent for me.
Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #4 on: December 10, 2012, 03:59:39 am »

Also - when I redid my main PC I added a "quality" Highpoint RocketRAID 8-port SAS/SATA controller... and after I moved all my content onto it, it "trashed" my content.  Windows CHKDSK saw errors, and after it "fixed" them I had all sorts of sector allocation errors.  In the end I had to reformat the lot and restore it from backup and my BD discs... very painful and loooooong.

I've just rebuilt my WHS box using just the standard SATA controllers (8 on the mobo, between the Intel and ASMedia SATA controllers) and added a cheap 4-port PCI Silicon Image controller to bring the total to 12.  Just plugged the disks in and all good (well, I had some user issues involving red wine, late nights, formatting the wrong disk and then not plugging stuff in correctly, but...... that is another story!!!)
Logged
JRiver CEO Elect

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #5 on: December 10, 2012, 09:45:27 am »

How do you intend to set up ZFS? You'd need a dedicated machine with some Linux distro and go through hell setting it up (unless you're experienced with Unix/Linux OSes), or go with an appliance-like distro which turns a PC into a NAS device, like FreeNAS.

Agreed.  If only OSX had actually gone forward with the plan to integrate ZFS.  Or something.

There are a variety of other interesting filesystems for POSIX-based systems, including btrfs, but none of them are what I'd call "ready for end users".  Certainly not Windows users.

If RAID-5-like parity is what you want and you're not afraid to get your hands dirty, you may look at SnapRaid or disParity. These work by means of a command-line tool which you run on a schedule to create parity blocks on a drive/folder. It takes some time to figure out and it's not the most user-friendly, but it does work if you manage to set it up properly.

I still don't see how any of this (not just your comment) is easier than simply running a real RAID5/6 volume.  Intel's RAID controller is built into basically every motherboard you can conceivably buy, and works quite well.  It isn't as fast as a dedicated hardware raid card, of course, but it almost certainly beats these other "on-top-of-the-filesystem" style software systems.

I needed more drives, so I got a Highpoint card.  I hate their configuration tool, but I only have to use it once in a blue moon, and otherwise the card has worked perfectly.  If you feel you might need to muck about with it more, LSI is right over that way....

I don't know... I'd love it if I could afford a NetApp storage system or something, but for my needs, this worked out well, and can scale up to a truly impressive number of drives.

If, instead, I really wanted the flexibility to have single volumes where I could add additional drives (of various sizes) at whim?

Drobo
unRAID
FreeNAS

So.... I don't know.  I'm just ranting, I guess.  All of the "on top of the filesystem" systems make me nervous.  One little thing goes wrong, and everything could be hosed.  Sure, you won't lose "data", but having the volume broken is almost just as bad (metadata is data too).
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #6 on: December 10, 2012, 10:05:17 am »

I still don't see how any of this (not just your comment) is easier than simply running a real RAID5/6 volume. 

Easier, not really. But it has advantages (and disadvantages, for that matter) because disParity and the like create parity on demand (snapshot RAID, not real-time parity calculation) and use standard volume formats like NTFS. They work well with Drive Bender. They don't actually create a striped RAID volume like a RAID card does, and therefore keep compatibility and flexibility. Some of these programs even allow creating parity blocks for folders instead of entire drives. Like Drive Bender, you can combine any number of disks of any size.

I tried them; they work, but I wouldn't use them either because I think it's too much hassle. I like something more automated, shiny GUIs, etc.

If people want to use motherboard-integrated RAID solutions, that's up to them. I burned myself one too many times using them and I will never, ever use them again. They are not to be trusted - very unreliable IMO.
Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #7 on: December 10, 2012, 10:50:50 am »

If people want to use motherboard-integrated RAID solutions, that's up to them. I burned myself one too many times using them and I will never, ever use them again. They are not to be trusted - very unreliable IMO.

Intel's used to be problematic way back in the day (especially the Pentium-4 era), but they're fine now.  I ran a RAID-5 with 5 drives off of the Intel integrated RAID controller for a LONG time (on my Core i5-750 system) with absolutely no trouble.  Make sure you get the latest Intel Storage Drivers and you'll be fine.

In fact, if you dig around a bit, the Intel software RAID is actually a bit more reliable (and occasionally faster) than some lower-end dedicated cards.  You used to need integrated DRAM and the whole bit to get any kind of performance out of them, but CPU speeds (and RAM access latencies) have eliminated most of the advantages of external controllers.  The rule of thumb used to be "don't use software RAID".  But that was when we had single-core CPUs on Windows XP with terribly slow memory access latencies, and most applications were single-threaded.  You needed that RISC chip with its dedicated RAM because you might be trying to render a scene in 3DS Max, with your little P4 CPU pegged and all your RAM used up.

We don't live in that world anymore, assuming you're not still running an old system like that.  Who cares if it causes 1-2% CPU utilization on one core and a tiny 256-512MB RAM hit?  Buy 8GB of RAM for $30, and your 4-core CPU is sitting there idle all of the time anyway.

Now... Many of the third-party SATA add-on controllers motherboard vendors slap onto their boards (like the Marvell ones Marko was having trouble with)?  Yeah, those are a completely different story.  I don't even like to use those ports at all, and if I do, it is only for simple external drives or optical drives.

But the Intel ones?

The main limitation on them is the number of drives you can add (though the new ones even support Port Multipliers properly).  And, I'll say this, the software for Intel's RAID management is WAY better than what Highpoint gives you.

And it makes sense...  That Intel IO chipset (despite being built on "old" process tech, by their standards) is probably a generation or two more advanced than the RISC chip on an external RAID card (which is almost certainly an off-the-shelf ARM core, built using the year-before-last's fabs, and then crammed into duty as a RAID controller).  Plus, with modern Intel CPUs, the controller and everything is on-die, so they're definitely at least a node ahead (and probably more like two or three) of everyone else.  Intel knows how to design chips, and they have the money to make sure their driver support is up to snuff (and a huge enough market that even tiny edge cases cause problems for them).

So, if you go with something high-end and enterprise class (like LSI), then, sure... But comparing them to the more low-end cards (Highpoint or LSI/Areca's cheaper models), you're basically just buying extra ports.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

Daydream

  • Citizen of the Universe
  • *****
  • Posts: 771
Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #8 on: December 10, 2012, 05:39:16 pm »

I don't see how any standard type of RAID (implemented in either software or hardware) is gonna cut it for anything home-related. Especially since the need is storage, lots and lots of storage, not speed.

With RAID you need the same type of HDDs, and the same - preferably high - quality of HDDs. Otherwise you expose yourself to problem after problem with 'Green' drives and the like, whose aggressive home-use features mess up the RAID array and the controller, or are plainly just not compatible. I don't wanna put up with anything like that. And then there's the darn parity thing: lose anything beyond the number of drives the parity covers and you lose everything. Unacceptable.

What do we actually store - for the purpose of this conversation? Movies, TV series, music, photos. Music and photos, given their size, can actually be backed up without much trouble; they can be uploaded to Crashplan (users who have Universal's entire music catalog are not covered by this discussion). Movies and TV series - how many? I don't know, I'll pull something out of my hat: 1000 movies and 6000 episodes. Those cannot be backed up. I would like to do parity for them.

Here come the 'over the filesystem' solutions. Instead of needing to buy 20 drives of the same type and a controller and I don't know what else, I can add whatever number of drives I want, in whatever sizes, and keep growing as I need. Then I can build parity snapshot-style, and also grow the parity as I see fit (parity to protect against 1 drive failure, 2 drives, 3 drives, n drives till it becomes a backup, 2x backup, 3x backup - just kidding :) ). How many movies and episodes do I add between snapshots? Not that many; I don't plan to buy the entire Warner catalog or DL the entire Internet. If I lose something between snapshots, so be it - it's gonna be a movie and 5 episodes. Pfffff...

You buy a piece of software (at worst, if the free alternatives are not enough) and just HDDs. OK, probably a couple of port expander cards too. And you don't spend money on hardware & solutions that provide things I don't actually need (speed and limited parity, but not the freedom to expand the storage and parity).

Now, I understand there's no solution that satisfies everybody. If one records 5 TV channels at the same time and watches content in 5 different rooms also at the same time, then of course a user like that would be / should be more concerned with the speed of the array. But I would deem that to be a special situation, not mine. I'd think mine is closer to the common denominator: lots of storage, slowly growing, slowly adding content, not wanting to spend an arm and a leg defying common sense.

On specifics: I used to be a FlexRAID user when it was free. It was not without its oddities, the most annoying to me at the time being that it needed drive letters and didn't work with mount points, and that it seemed unable to write a lot of small files at once (like DLing thumbs for a series' episodes from thetvdb.com and writing all of them at once). These things seem to have changed for the better.

Regarding ZFS, yes, my go-to solution would be something like FreeNAS. It's not exactly my strong point, but I would like to learn it (preferably without destroying my media :)). Of course it makes no sense to learn ZFS just for the task of protecting media files, but I'd like to learn it because I think it'll help in the long run, let's say job-related. Plus it plays along with my philosophy of 'use brains, don't spend money'. Interestingly enough, the guy that developed FlexRAID is thinking about an NZFS (Not-ZFS) solution that would mix and match features from the current implementations (RAID under, within, and over the filesystem).
Logged

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #9 on: December 10, 2012, 09:06:19 pm »

I've just rebuilt my WHS box using just the standard SATA controllers (8 on the mobo, between the Intel and ASMedia SATA controllers) and added a cheap 4-port PCI Silicon Image controller to bring the total to 12.  Just plugged the disks in and all good (well, I had some user issues involving red wine, late nights, formatting the wrong disk and then not plugging stuff in correctly, but...... that is another story!!!)

FYI - apart from my stupidity trashing my Drive Bender #1 disk, there were two issues that between them took me days to sort out:
- NIC driver:  WHS (based on Windows Server 2008 R2) refused to install as it could not "find" a driver for my Intel Gigabit Ethernet 82579V controller.  I had the discs, I had downloaded all versions of the drivers etc. etc... Nothing... Turns out Intel doesn't want you using "consumer" grade stuff on their server products, so in the install config file they removed the install option for the 82579V.  The strange thing is that WHS is aimed exactly at consumers.  Anyway, after many hours I found someone else in the same boat who had edited the install config file, and away we went.
- Permissions:  Once it was all up and running I used SyncToy from my "Main" PC pool to the new WHS pool to replace the files I had trashed on DB disk 1.  All seemed OK, but when I ran SyncToy there was always more to sync and it would throw some funny errors.  On closer inspection it was always the same files, and while I could see them on my WHS box I could not see them from my Main PC's mapping of the pool.  Now... I had tried setting permissions using both the WHS Dashboard and Remote Desktop, but it seemed to have stalled on a couple of attempts, so I rebooted.  The end result was that a bunch of these files were not inheriting their permissions but were specific to accounts using SIDs (if I have that term correct).  A bit more googling and I found the great option to "replace all child object permissions with inheritable permissions from this object", and within a couple of minutes it was solved.
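(For what it's worth, the rough command-line equivalent of that checkbox is an elevated icacls reset - the path here is just an example, not my actual share:

Code:
icacls "D:\Pool\Media" /reset /T /C

which throws away the explicit per-SID ACLs and puts everything under that folder back onto inherited permissions.)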

Logged
JRiver CEO Elect

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #10 on: December 11, 2012, 12:10:15 am »

- Permissions:  Once it was all up and running I used SyncToy from my "Main" PC pool to the new WHS pool to replace the files I had trashed on DB disk 1.  All seemed OK, but when I ran SyncToy there was always more to sync and it would throw some funny errors.  On closer inspection it was always the same files, and while I could see them on my WHS box I could not see them from my Main PC's mapping of the pool.  Now... I had tried setting permissions using both the WHS Dashboard and Remote Desktop, but it seemed to have stalled on a couple of attempts, so I rebooted.  The end result was that a bunch of these files were not inheriting their permissions but were specific to accounts using SIDs (if I have that term correct).  A bit more googling and I found the great option to "replace all child object permissions with inheritable permissions from this object", and within a couple of minutes it was solved.

If I understand you correctly, I had the *exact* same issue with permissions.
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #11 on: December 11, 2012, 12:32:20 am »

I would like to do parity for them.

disParity's latest build comes with a GUI. It seems to work fine, and I think I'll be using it to see how it works over the long term. I haven't looked into it yet, but I think for scheduled calculations we still need to work with scripts. Just configure a disk/folder for storing the parity files, add the disks one by one, then calculate the parity for them. It's actually very easy to use.

But just to be clear: this is NOT backup. I don't care what they call it on their website, it is NOT backup. It carries an increased risk and it's stored on the same machine (technically it doesn't have to be, though). It is a safeguard against deleted or changed files, but if files that are needed to restore another file are changed after taking a parity snapshot, the restore will fail. disParity allows the exclusion of certain files, and I think it works best if you exclude all sidecar files, as they change when file info changes (note to self: something I need to look into as well).
Logged

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #12 on: December 11, 2012, 02:42:21 am »

If I understand you correctly, I had the *exact* same issue with permissions.

Sounds like it could have been the same thing.  I did not track it closely enough to work out what files were on what drive etc... but my feeling is it was on the DB1 drive that I stuffed up (my own fault) when I was re-adding the pool on the new HW.  Like all such things, the fix only took about 2 mins, but it took ages to work out what was wrong in the first place!
Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #13 on: December 11, 2012, 02:44:18 am »

FYI, for those wanting Parity, DB is asking for feedback on their plans to add RAID ... http://forum.division-m.com/index.php?topic=1390.0  I said it was a poor idea but I'm sure others will think it is good!
Logged
JRiver CEO Elect

MrHaugen

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 3774
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #14 on: December 11, 2012, 03:14:16 am »

I don't know about you, but I am plenty happy with my setup. It's a 24-bay, RAID 6 solution with 2 arrays on 24 TB (2TB drives).

I bought a Norco DS-24E. It cost around $1000. Then you add the drives needed (up to 24) and a RAID card, and you have a solid semi-professional setup. All you need is a hardware RAID card with one mini-SAS connector. I don't think those should cost more than 250-500 bucks?

As this is an expander case, you can in theory just add disks and increase the capacity as you like. There should be no limitations when it comes to drives. The most cost-effective drives now are probably the 3TB ones? That gives you 72 TB total. You can create two arrays (as I have) and sync the data between them, which gives you a very secure storage solution. If you ever hit the space limit, you can just purchase another expander box and serial-link them.

The good thing about this is that you can increase the storage as you need, and it can even do this while the storage is in use. It gives you very good speed (depending on the RAID card), and it's secure. And it's quite reasonably priced IMO. It's an investment, sure, but it's an investment that you will have for a long time.
Logged
- I may not always believe what I'm saying

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #15 on: December 11, 2012, 03:30:23 am »

It cost around $1000....

Then you add the drives needed (up to 24) and a RAID card, and you have a solid semi-professional setup. All you need is a hardware RAID card with one mini-SAS connector. I don't think those should cost more than 250-500 bucks?

With the amount of cash you dished out I should sure hope you're happy but not everyone can afford that or is willing to drop that kind of cash.
Logged

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #16 on: December 11, 2012, 03:31:30 am »

I was keen on the Norco solution (a favourite among home builders), but it has got lots of bad press recently for poor-quality hardware frying drives - http://wsyntax.com/cs/killer-norco-case/
Logged
JRiver CEO Elect

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #17 on: December 11, 2012, 03:35:14 am »

FYI, for those wanting Parity, DB is asking for feedback on their plans to add RAID ... http://forum.division-m.com/index.php?topic=1390.0  I said it was a poor idea but I'm sure others will think it is good!

Of course it's a good idea, as long as you can turn it off if you don't want it.

If they implement it the way I expect, then parity in this setup is almost perfect, as it keeps your drives as stand-alone drives with standard NTFS volumes. For situations where you don't/can't back up the entire volume, parity really is a good alternative with minimal added risk.

I'll add my vote to that thread to compensate for your skepticism :P ;).
Logged

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #18 on: December 11, 2012, 03:36:49 am »

The other "low tech" solution is that as MC abstracts drives anyway you can just use single HDD without any HW or SW Raid or Pooling Solution.  Eg my WHS box is on a Asus P8Z77-V Pro that has 8 Sata Ports.  I then added 4 more with a cheap PCI card giving me 12 ports.  I've used a case with 6 Hot Swap Bays + added a 5 Drive in 3 bay chassis.  This give me 11 Drive Bays + 1 for a BD Drive or E-Sata.  This gives heaps of space for most (eg I'm moving to 4TB HDD).  I do use DB to pool the drives to keep mgt easier but that is really not needed for MC.  Once you fill up a drive with content just move to filling the next one.
Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #19 on: December 11, 2012, 03:47:03 am »

Of course it's a good idea, as long as you can turn it off if you don't want it.

If they implement it the way I expect, then parity in this setup is almost perfect, as it keeps your drives as stand-alone drives with standard NTFS volumes. For situations where you don't/can't back up the entire volume, parity really is a good alternative with minimal added risk.

I'll add my vote to that thread to compensate for your skepticism :P ;).

Good man!  Always good to expand the conversation, and yes, they have said they will introduce it as a selectable option.  You had best post your opinions though, as it is about 5 to 1 against!  I'm not against the option, but I would rather see their dev effort go into other features like better SMART monitoring and drive failure prediction.  I'd rather replace a drive before it fails.  On that note, I've started removing my older 2TB WD Greens that have over 1,000 days of up time.  I'm now waiting for more sales at B&H on the Hitachi 4TB to get them back under $200 (I already have 12).

I don't mind the RAID concept in general, but I've decided on a separate backup.  Next to that, RAID is a PITA.  Say I did go RAID on my two pools and one drive fails.  I could then replace the drive and it would take about 80 hours to rebuild from parity.... or I could replace the drive and resync the missing content from the second pool in about 12.  What worries me is the perception that RAID is backup.  Nothing beats a backup (if you can afford it, that is) + it is quicker, simpler and covers more than a single drive failure (e.g. multi-drive failure, inadvertent deletions/changes, theft, fire etc.)
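Back-of-the-envelope for those two numbers - the throughputs below are just assumptions picked to roughly match what I saw, not measurements:

Code:
# rebuild from parity vs resync from the backup pool (rough sketch)
DRIVE_TB = 4
PARITY_REBUILD_MBPS = 15   # assumed effective speed of a snapshot-parity rebuild across the pool
RESYNC_MBPS = 95           # assumed speed of a plain copy back from the second pool

def hours(tb, mbps):
    return tb * 1e6 / mbps / 3600

print(round(hours(DRIVE_TB, PARITY_REBUILD_MBPS)))  # ~74 h, call it 80
print(round(hours(DRIVE_TB, RESYNC_MBPS)))          # ~12 h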
Logged
JRiver CEO Elect

MrHaugen

  • Regular Member
  • Citizen of the Universe
  • *****
  • Posts: 3774
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #20 on: December 11, 2012, 03:49:57 am »

I was keen on the Norco solution (a favourite among home builders), but it has got lots of bad press recently for poor-quality hardware frying drives - http://wsyntax.com/cs/killer-norco-case/
That's not good at all. Thanks for the link. I'll be sure to investigate more before I upgrade to 3 or 4 TB drives then! Scary stuff.


InflatableMouse, I don't disagree that it's a lot of money. But you get a lot for the bucks if you can cope with this almost "one-time" investment. These MOSFET problems, however, might put a damper on it.
Logged
- I may not always believe what I'm saying

Mr ChriZ

  • Citizen of the Universe
  • *****
  • Posts: 4375
  • :-D
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #21 on: December 11, 2012, 03:52:10 am »

With the amount of cash you dished out I should sure hope you're happy but not everyone can afford that or is willing to drop that kind of cash.

How much does 20TB of movies, shows and music cost?  :)

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #22 on: December 11, 2012, 03:52:46 am »

That's not good at all. Thanks for the link. I'll be sure to investigate more before I upgrade to 3 or 4 TB drives then! Scary stuff.

There has been a bit on this and on which version of the backplanes they used.  The disti even pulled them from sale down here in Oz, from what I read.  Really, this stuff should be easy and bulletproof, but as we have seen in this thread nothing is that simple, though it escapes me why!
Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #23 on: December 11, 2012, 04:07:35 am »

How much does 20TB of movies, shows and music cost?  :)


Here is the calc I used:
- BD = 40GB on average, and costs from say $10 (for older titles) to $30 for new releases
- a 4TB drive holds 100 BDs, and at $200 per drive that is $2 per BD, and another $2 if you want to "back it up" to other HDDs

To this you then need to add the cost of your "servers".  To keep the math easy let's say each one costs $1K and in my case each manages say a 24TB pool.  So, rounding down, that is another $1.5 per BD + $1.5 per BD for the backup server.  I also then added my own ghetto UPS ($200 for an AGM battery, $100 for a pure sine wave inverter, $300 for a quality 20A 230V to 13.8V (variable) power supply), which gives me 5 hours of UPS... so say another $1 per BD.

So the way I figure it: either buy a BD player for under $100 and just use discs, or.... at the other end of the scale, have the "fun" of a fully backed up networked library that can serve all your content - but that adds about $8 per BD.  Of course these costs come in big "lumps"!
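Or, as a rough script of the same arithmetic (all the dollar figures are just the assumptions above, nothing more scientific than that):

Code:
# cost per BD, back-of-the-envelope
BD_GB = 40                   # average BD rip
DRIVE_COST, DRIVE_TB = 200, 4
SERVER_COST, POOL_TB = 1000, 24
UPS_COST = 600               # battery + inverter + power supply

bds_per_drive = DRIVE_TB * 1000 // BD_GB     # ~100
bds_per_pool  = POOL_TB * 1000 // BD_GB      # ~600

storage = DRIVE_COST / bds_per_drive         # ~$2.0 per BD
server  = SERVER_COST / bds_per_pool         # ~$1.7 per BD
ups     = UPS_COST / bds_per_pool            # ~$1.0 per BD

main_pool   = storage + server + ups
backup_pool = storage + server
print(round(main_pool + backup_pool, 1))     # ~8.3, i.e. about $8 per BD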
Logged
JRiver CEO Elect

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #24 on: December 11, 2012, 04:14:32 am »

To balance this "obsession", it all gets trivial if you only want to do this with Music (you could use a thumb drive this days) or even DVD or low quality video rips.  Another valid view was a guy at work who when I explained my setup asked why I bother as you only watch most movies once..... so why store them anyway!

..... it's not an obsession, it's a hobby and so it should be done "right"!
Logged
JRiver CEO Elect

Hendrik

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 10944
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #25 on: December 11, 2012, 04:27:32 am »

While I was too lazy to read the whole topic, I run a FlexRAID system with ~25TB of usable space right now. It's been running for 6 months or so now and I've never had a single issue with it.
It's a hell of a lot easier to set up than a RAID 5, and what was a bonus for me is that it's the only RAID solution which supports mixed-size drives, so I could keep my 2TB drives and add some new 3TB drives into the mix.
As a bonus, you could just turn it off and still access all the files, so even if you have a catastrophic failure, only the hardware that failed is gone, and not everything like in a striped array.

So yes, it might not be as fast as a dedicated striped RAID 5; it's limited to the speed of your individual disks, and no less. For plain stupid media storage, that's plenty. I can usually still saturate my GBit home network if I move stuff.

PS:
Personally I wouldn't trust those "fake RAID" controllers one bit (the Intel "RAID" controller on every MB); rather, use a pure software RAID (mdraid if you happen to run Linux), or even go with Windows Server 2012 and use MS's storage pooling features, or, well, FlexRAID/SnapRaid/unRAID or any other third-party RAID component.

"Fake Raid" because they claim to be HW controllers, but all the raid operations are done in software through the firmware/drivers.
Logged
~ nevcairiel
~ Author of LAV Filters

jmone

  • Administrator
  • Citizen of the Universe
  • *****
  • Posts: 14465
  • I won! I won!
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #26 on: December 11, 2012, 04:28:37 am »

If you want an "easy" life you could get a fully Loaded Drobo at approx $5 per BD (+ 5 for a second one for backup) - http://www.bhphotovideo.com/c/product/843829-REG/Drobo_32TB_8x4TB_B800fs_8_Bay.html
Logged
JRiver CEO Elect

ikarius

  • Recent member
  • *
  • Posts: 20
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #27 on: December 18, 2012, 02:14:51 pm »

I'm going to chime in here - I know a fair bit about storage.  I'd say that ZFS is the best "software RAID" implementation I've seen, though it's far from perfect.  If you're doing a white-box "roll-your-own", I strongly suggest using LSI 92xx SAS/SATA HBAs to connect your drives. One interesting approach I've contemplated is building a reasonably beefy system, loading VMware ESXi on it, running a Solaris VM, and using PCIe passthrough to hand the LSI HBAs to the Solaris VM. Then you can run a Windows or Linux VM alongside for whatever other software services are needed, and still get pretty reasonable performance from your RAID network server.

If you'd like to simplify things, there's only one vendor with moderately reasonable out-of-the-box solutions: Synology. Their DS1512+ is a pretty darn good RAID server in a box, and they're far superior to everything else, both from a software standpoint and from a performance standpoint.



Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #28 on: December 18, 2012, 06:48:06 pm »

I'm going to chime in here - I know a fair bit about storage.  I'd say that ZFS is the best "software RAID" implementation I've seen, though it's far from perfect.

Totally agree.  I wish there was a good (commercial) implementation for Windows, or that Microsoft would have really released ReFS.

I strongly suggest using LSI 92xx SAS/SATA HBAs

I'm almost certainly going to replace my RAID card in the next year.  I'm leaning towards an LSI card.  Any comments on them versus the Areca competitive offerings?  I'd be looking at a lower-end RAID 5/6 supporting card with two external miniSAS ports, probably.  I'm not going Fibre Channel because (1) I'm not crazy, (2) even iSCSI would give me almost the bandwidth I'm looking for (though, of course, more = betterer), and (3) that'd be silly, as there will be better Thunderbolt standalone options before too long (at least as long as Intel remains as dominant as they are), which would be faster and cheaper in the end if they build controllers specifically for it, rather than tacking on existing SAS solutions and bridging them.

I can also second the Synology rec.  I've played with one at work and it was very nice, and AnandTech has given them pretty glowing reviews for home use.  It did seem, as most of those things do, that it had way more "stuff" tacked on than I'm comfortable with, and I'm looking for local storage, not a network appliance.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #29 on: December 18, 2012, 06:54:44 pm »

Totally agree.  I wish there was a good (commercial) implementation for Windows, or that Microsoft would have really released ReFS.

As uncouth as it is, I'm going to quote myself...

Except that, I'd really rather that Apple actually finished their implementation.  OSX needs a better filesystem worse than Windows does (though both are pretty ancient and creaky), and then you could just share it over SMB.  Plus, then third parties would have had more motivation to build Windows ports, as they do for HFS+ now.
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

ikarius

  • Recent member
  • *
  • Posts: 20
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #30 on: December 19, 2012, 12:15:26 pm »

On hardware RAID, I'd give the edge to Areca.  Areca's hardware RAID is as good as anyone else's (which isn't saying a lot), and their configuration UI is substantially better than LSI's, Adaptec's or Highpoint's. That being said, I have a hard time recommending any hardware RAID card, and I'd personally always go LSI 92xx HBA + software RAID for any white-box solution I built.  I agree with you that it's shameful that the major OS vendors haven't managed to build a better storage stack to date, and I was really disappointed to see Apple abandon their ZFS implementation.

On the Synology side, I wouldn't let the list of tack-ons deter you.  I have one myself, and they have done an amazing job of putting it together "right".  It is a captive Linux OS underneath, but they have put a truly excellent UI on it, and most of the extras don't run by default.  You install/turn them on if you want/need them, but it's very clear that job #1 for the synology is network storage, and it does not compromise that functionality for the sake of extra features they can advertise.
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #31 on: December 19, 2012, 01:10:12 pm »

I agree Synology is probably offering the best NAS systems out there for home use. Having said that, the hardware is barely capable. An Atom with just enough memory to keep it running and you pay a premium price of around 180 dollars per bay. I'm sorry but I think that is absurd.

For that price I can build an i5 with 16GB of memory and add a RAID card. It will have a fanless Seasonic PSU and a nice tower case with room for 8 3.5" disks. FreeNAS or something similar will do just fine, or run the free ESXi with FreeNAS in a VM and give it access to some of the disks. That will leave you room to add more VMs, for instance.

But that's just to show what I can do for that money. If you don't want that, you can still build something just as "powerful" and as efficient as a Synology NAS for half the money.

I had a Thecus 5200XXX, supposedly the "professional" line of Thecus NASes. I think the hardware is pretty similar to Synology's, but the software is somewhat outdated, at least that's how it looks. After a few days of playing around I found it seriously lacking in terms of multitasking power. All the benchmarks mean nothing if the box can't deal with more than one copy job, and an Atom simply isn't up to that job IMO. The system just bogged down, and using it as a download station is simply out of the question for the same reason - it can't do more than one thing at once. With two large copy jobs the management page sometimes froze up for seconds, and when it updated it was clear from the stats that it wasn't able to cope. It can't use SSL for downloads because it doesn't have the power to decrypt faster than ~6MB/s; I download at 14MB/s, so go figure. I actually knew this because my old ASRock Atom HTPC had the exact same issue with SSL (it's software decryption and the Atom can't cope).

To me a NAS is just a horribly overpriced device, and vendors simply profit from the illusion that a NAS is a simpler device (plug and play) compared to building one yourself. They build the hardware as cheaply as possible - just powerful enough that it doesn't feel too lackluster in the interface - and they count on the fact that it will mostly be a single-user storage device (or be used by appliances like media players that don't need bandwidth). But really, no NAS is plug and play, and if you can manage to set up a proper RAID volume in a NAS, you can manage to set up FreeNAS with ZFS too. You only have to add building the system and finding the right parts for it, but it either saves you half the money or gives you an infinitely more powerful system to play with.

My 2 cents worth :).
Logged

glynor

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 19608
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #32 on: December 19, 2012, 01:47:40 pm »

On the Synology side, I wouldn't let the list of tack-ons deter you.  I have one myself, and they have done an amazing job of putting it together "right".  It is a captive Linux OS underneath, but they have put a truly excellent UI on it, and most of the extras don't run by default.  You install/turn them on if you want/need them, but it's very clear that job #1 for the synology is network storage, and it does not compromise that functionality for the sake of extra features they can advertise.

I've recommended them myself to others.

For me, personally, it's awfully tough to push this stuff through any NAS, so I have to have local storage on the big box.  iSCSI can do a stream or two of 422HQ, but in my experience, with a NAS network stack in the way of the I/O, you can't sustain 220Mbps+ throughput (and certainly not the 330Mbps+ that 4444 at 1080p30 uses).  My current Highpoint based RAID gives me plenty to squeeze two simultaneous streams through.

Thanks for the rec though.  I'll look more closely at Areca.  I'm certainly going to consider just getting a dumb HBA and doing software RAID.  If I do this, would I just use their software to do the software RAID, or is there a good high-performance third-party software solution?  I'm looking for RAID5/6, the best read/write speeds I can manage (within reason, with a target of sustaining at least 400-500Mbps, which my current array does fine), and some software that doesn't make me want to tear my eyes out.

I don't care about having to have uniform disks or all of that.  I do want Online Expansion though (RAID Level Migration+Capacity).
Logged
"Some cultures are defined by their relationship to cheese."

Visit me on the Interweb Thingie: http://glynor.com/

ikarius

  • Recent member
  • *
  • Posts: 20
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #33 on: December 19, 2012, 02:28:16 pm »

Heh. You're absolutely right, NAS generally does not get you the throughput for real video editing work.  Last time I worked on a video editing setup, it was really serious stuff; Fiber Channel based, multiple LSI (now NetApp E-series) storage arrays, a solid-state caching layer, and ~20 different workstations (primarily Mac and Linux clients) talking to a NexStore distributed filesystem.   That was designed to support multiple folks simultaneously working on video editing for one of the major movie studios, however.

If the files you're working on are small enough, I'd actually ask if your workflow could reasonably work locally on Solid-State storage, and save final results on networked RAID.  That's what most of the smaller scale approaches I've seen have gone towards.

As far as 3rd party software raid goes, I'm afraid I've got little to offer; the solutions I've worked on were either using built-in OS capabilities, or they were high-end fiber channel solutions, using enterprise software.

A couple of other bits for you: generally, any system which implements "modern" features (point-in-time snapshots, thin provisioning, etc.) costs extra IOPS, and tends to take you further away from the hardware's sustainable throughput. This includes ZFS, btrfs, etc.  In the case of ZFS, those features are always on, so ZFS isn't a great candidate for this sort of work. ZFS tends to do quite reasonably on initial writes, but as soon as you're overwriting a file, the throughput drops tremendously.  You also want to be highly aware of your data alignment: if you build a 4 data + 1 parity RAID-5 set and you're using a chunk size of 64k, that means a RAID stripe is 256k wide.  The first time you write to some space, that's fine, but if you go back and write somewhere in the middle of that RAID stripe, the RAID layer is going to need to read the data on either side (to have the complete stripe) before writing out the parity bits.  This is why any solution involving substantial random write activity really needs RAID10 rather than RAID5/6.
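To put toy numbers on that partial-stripe penalty (purely illustrative - a simplified model that ignores caching and write coalescing):

Code:
# RAID-5 partial-stripe write cost, toy example: 4 data disks + 1 parity, 64k chunks
CHUNK_KB, DATA_DISKS = 64, 4
STRIPE_KB = CHUNK_KB * DATA_DISKS            # 256k of data per stripe

# Full aligned stripe: write 4 data chunks + 1 parity chunk.
full_stripe_ops = DATA_DISKS + 1             # 5 disk ops for 256k of new data

# 64k write into an existing stripe: read back enough to recompute parity
# (old data chunk + old parity), then write the new data chunk + new parity.
partial_write_ops = 2 + 2                    # 4 disk ops for only 64k of new data

print(full_stripe_ops / (STRIPE_KB // CHUNK_KB))  # 1.25 ops per 64k when aligned
print(partial_write_ops)                          # 4 ops per 64k when not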

One other approach you might consider would be InfiniBand; InfiniBand adapters and switches are relatively inexpensive, and can sustain the sort of throughput you're looking for.  A Linux server with a decent disk subsystem and mdadm for RAID + NFS serving it across InfiniBand might be an effective solution for you.  Ethernet doesn't tend to scale well above 1 gigabit.  Because Ethernet is not an arbitrated protocol (anyone can transmit at any time), getting anything approaching wire-rate throughput requires complex back-off/retransmit algorithms and a lot of buffering silicon in the switch; scaling the speeds up requires exponentially more silicon in the switches.  This is why 10 gigabit Ethernet ports are still extremely expensive, while 8 gigabit Fibre Channel is cheaper than 4 gigabit Fibre Channel was 4 years ago, and 16 gig Fibre Channel is already out and likely to be as cheap as 8 gig Fibre Channel is today within ~3 years.  InfiniBand is also an arbitrated protocol, so its cost per gigabit scales reasonably.

If you want to chat further, feel free to PM me  for my contact info.  I'm happy to share things I've learned.

Cheers
  Ikarius
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #34 on: December 20, 2012, 10:47:16 am »

With the two of you present in one thread, and this thread being one big hijack already, I was just wondering whether you guys think this is a silly idea or actually viable. I tend to think outside the box :D.

Some background:
A while back I decided not to separate storage from the HTPC, but to run it as a combined HTPC/storage server. Five 2TB disks with Drive Bender should do the job, with Windows serving the file shares.

A few months later I'm not that happy with Drive Bender after all, and I'm looking for ways to get rid of it.

Today:
I'm wondering if running a virtual machine on the HTPC and using the raw disks in it with FreeNAS is viable. Well, either the raw disks, or simply keeping them as NTFS volumes and placing a virtual disk on each.

I'm testing this with small virtual disks (5 x 100GB), 2 cores and 4GB of memory. I can give it more memory if needed; that PC has 16GB total (the CPU is an Ivy Bridge i5). Performance leaves something to be desired with a simple copy job... I'm not entirely sure which way to go with this, if at all.

What do you guys think?
Logged

ikarius

  • Recent member
  • *
  • Posts: 20
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #35 on: December 20, 2012, 11:59:53 am »

As far as running a virtual machine goes, most of the VM solutions which run on top of a normal OS don't provide great IO performance.  VMware ESXi provides reasonable performance, and Windows Azure may provide decent performance (I don't have experience with it, but have reason to believe it would).  FreeNAS isn't bad, if you can get sufficient IO performance for your application.  I'm not saying that running a VM solution on top of a normal OS (VMware Server, VirtualBox, etc.) won't be sufficient, only that its performance is going to be lower than the other approaches; you'll need to do tests to see if it's enough for you.

Are those 100gig disks physical spindles, or multiple virtual disks on a larger physical disk? If they're virtual on a single physical, that's going to hurt performance substantially.  Do you have enough physical disks to set up those 100gig disks with one on each physical disk?

You might be able to get away with the solution you're suggesting.  You might be better served with VMware ESXi, running both FreeNAS and your Windows OS as virtual machines under it.  It's hard to say without trying it out.
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #36 on: December 20, 2012, 12:30:49 pm »

Thanks.

The virtual disks are each on a separate physical disk.

I've considered running ESXi, but there are two things I can't live without, and I don't know how well they would work (if they work at all): bit-perfect audio and refresh rate sync to the video fps. Neither works in VMware Workstation, but with ESXi and a decent hypervisor it might map the hardware directly to the VM... I don't know. It would be a really big hassle to try it out, and not without risk. Posts on the VMware forums have gone unanswered.

Would you know if direct access to a disk from a VM provides a performance benefit (compared to using VMDKs)?
Logged

InflatableMouse

  • MC Beta Team
  • Citizen of the Universe
  • *****
  • Posts: 3978
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #37 on: December 20, 2012, 01:41:14 pm »

Ah, just realized another reason for not running ESXi... the K-series Core processors do not support VT-d, a requirement for VMDirectPath (Directed I/O).
Logged

ikarius

  • Recent member
  • *
  • Posts: 20
Re: Cost effective way to maintain a healthy array, 20TB or bigger?
« Reply #38 on: December 20, 2012, 02:05:17 pm »

Ah, my apologies.  You're talking about having your HTPC and your storage server on the same hardware.  I'm afraid ESXi isn't a terribly good fit for that.  ESXi is not very good at dealing with video or sound hardware.  What is more likely to work well is a client-server model media service, running the server portion on the same hardware as the storage.  I ended up using JRiver for my music, streaming the music to my sound system wirelessly with an Apple AirPort Express.  I ended up going with Plex for my video content, as Plex does a pretty darn good job of implementing a client-server model, allowing me to maintain a centralized server which streams the content to a rendering device.  I have a Roku and a tablet which run Plex clients, to render the video where I want.  I looked at JRiver's video capabilities for quite a while, but I have concluded DLNA is just a poor protocol.  This keeps my media server stuff separate from the client end, so the server end doesn't have to worry about rendering the sound & video.
Logged