[Bioclusters] Urgent advice on RAID design requested

Joe Landman landman at scalableinformatics.com
Fri Jan 19 07:54:23 EST 2007

Malay wrote:

> Interesting discussion everyone. My limited experience says given the 
> price of redundant but cheap systems and reliable but expensive systems, 
> one should go for the cheapest systems that serve your purpose, plus 
> redundancy, rather than a reliable and more expensive system. To elaborate, two 

There is a story going around about this.  Some airplane manufacturer 
built a small plane with your choice of engines.  The first option was a 
single (unknown manufacturer) high quality, more expensive turboprop. 
The second was a pair of (also unknown manufacturer) "reasonable" quality 
piston engines.  It turns out that the company sold the benefits of 
"cheaper but redundant" to its audience.  The buyers who purchased them 
looked at the failure statistics and noted that even with the redundant 
pair, if one engine failed, you were still in quite a bit of trouble.

The point being, if you are going to bet your life, or your data, on 
something, it makes sense to go with hard data rather than speculation.

The cheapest drives around, Maxtors and their ilk, have seen failure 
rates higher than 3-4% in desktop and other apps.  Sure, you will save a 
buck or two on the front end (acquisition).  Unless you can tolerate 
data loss, do you want to deal with the impact on the back end?  Without 
trying to spread FUD here, how much precisely is your data worth?  How 
many thousands or millions of dollars (or euros, or ...) have been spent 
collecting it?  Once you frame the question in terms of how much risk 
you can afford, you start looking at how to ameliorate the risk.
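
As a rough illustration of framing the question in terms of risk, here is 
a minimal Python sketch.  It assumes the failure rate is a per-year figure 
and that drives fail independently; neither assumption comes from the 
numbers above, and drives from one batch often fail together.  The 
function names and the example values are hypothetical.

```python
def prob_any_failure(n_drives: int, failure_rate: float) -> float:
    """Chance that at least one of n_drives fails in the period.

    Assumes independent failures (an assumption; correlated failures
    within a batch would make the real picture worse).
    """
    return 1.0 - (1.0 - failure_rate) ** n_drives


def expected_loss(p_loss: float, data_value: float) -> float:
    """Expected cost of a loss event: probability times what the data is worth."""
    return p_loss * data_value


if __name__ == "__main__":
    # Hypothetical example: 16 drives at a 4% per-year failure rate,
    # protecting data that cost 500,000 (dollars, euros, ...) to collect.
    p = prob_any_failure(16, 0.04)
    print(f"P(at least one drive fails) = {p:.3f}")
    print(f"Expected loss with no redundancy = {expected_loss(p, 500_000):,.0f}")
```

With no redundancy every drive failure is a loss event, so the second 
number is what the "savings" on the front end have to be weighed against.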

There are simple, (relatively) inexpensive methods.  N+1 power supplies 
add *marginal* additional cost to a unit.  Using better drives (notice I 
didn't say FC/SCSI/SATA) adds minute costs to the unit.  Using 
intelligent redundancy (RAID6 with hot spares, mirroring, ...) reduces 
risk at an increase in cost.
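
To see how much redundancy buys, the sketch below compares layouts by how 
many drive failures each can absorb.  It is a crude model under stated 
assumptions: failures are independent, and `p_fail` is a hypothetical 
probability that a given drive fails within one window of exposure.  It 
ignores rebuilds and hot spares (which lower the real risk) and correlated 
failures (which raise it).  The function name is made up for illustration.

```python
from math import comb


def prob_data_loss(n_drives: int, tolerated: int, p_fail: float) -> float:
    """P(more than `tolerated` of n_drives fail in the same window).

    Binomial tail under the independence assumption.
    tolerated = 0 for no redundancy, 1 for a mirror pair or RAID5,
    2 for RAID6.
    """
    return sum(
        comb(n_drives, j) * p_fail**j * (1.0 - p_fail) ** (n_drives - j)
        for j in range(tolerated + 1, n_drives + 1)
    )


if __name__ == "__main__":
    p = 0.04  # hypothetical per-drive failure probability in the window
    for label, n, k in [
        ("8 drives, no redundancy", 8, 0),
        ("8 drives, single parity", 8, 1),
        ("8 drives, RAID6", 8, 2),
        ("2 drives, mirrored", 2, 1),
    ]:
        print(f"{label:26s} P(loss) = {prob_data_loss(n, k, p):.5f}")
```

Each extra tolerated failure cuts the loss probability sharply, which is 
the risk reduction being paid for.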

We are not talking about EMC costs here.  Or NetApp.  If you are 
spending north of $2.50/GB of space you are probably overspending, though 
this is a function of what it is and what technology you are buying.
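
For checking a quote against a per-GB figure like this, a small sketch 
follows.  It assumes the figure is measured against usable space after 
parity and hot spares are set aside (the text does not say whether raw or 
usable is meant), and the drive count, drive size, and price are 
hypothetical.

```python
def usable_gb(n_drives: int, drive_gb: float,
              parity_drives: int = 2, hot_spares: int = 1) -> float:
    """Usable capacity of one parity RAID set.

    Defaults model RAID6 (two drives' worth of parity) plus one hot spare.
    """
    data_drives = n_drives - parity_drives - hot_spares
    if data_drives < 1:
        raise ValueError("not enough drives for this layout")
    return data_drives * drive_gb


def cost_per_gb(total_cost: float, n_drives: int, drive_gb: float,
                parity_drives: int = 2, hot_spares: int = 1) -> float:
    """Acquisition cost divided by usable capacity."""
    return total_cost / usable_gb(n_drives, drive_gb, parity_drives, hot_spares)


if __name__ == "__main__":
    # Hypothetical unit: 16 x 500 GB drives, RAID6 + 1 hot spare, $12,000.
    print(f"usable: {usable_gb(16, 500):.0f} GB")
    print(f"cost:   ${cost_per_gb(12_000, 16, 500):.2f}/GB")
```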

> separate machines with cheap components (cheapest SATA drives with single 
> power supply) is better than one expensive machine (higher quality hard 
> drives, redundant power supply). What you Gurus say?

I believe that you can save money at the most appropriate places to do 
so.  I'm not sure this is it.  It's your data, and you have to deal 
with/answer for what happens if a disk or machine demise makes it 
unrecoverable.  People who have not had a loss event usually don't get 
this (i.e. it hasn't bitten them personally).  If you have ever lost data 
due to a failure, and it cost you lots of time/energy/sweat/money to 
recover or replicate it, you quickly realize that the "added" cost is 
a steal, a bargain in comparison with your time.  Which you should value 
highly (your employer does, and rarely do they want you spending time on 
data recovery, unless this is your job, as opposed to what you are paid 
to do).

> -Malay
> _______________________________________________
> Bioclusters maillist  -  Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
