[Bioclusters] A set of notes on how to do RAID optimization on a compute node

Joe Landman bioclusters@bioinformatics.org
Mon, 12 Aug 2002 15:09:42 -0400 (EDT)


Hi Simon:

On Mon, 12 Aug 2002, Vsevolod Ilyushchenko wrote:

> Joe,
> 
> This is a very informative article. Thank you. A couple of followup 
> questions:
> 
> 1. How would I do a software raid with more than two disks. That is, 
> what PC hardware configuration to have more than two independent 
> controllers. The only way to get them that I know of is through a RAID 
> card, but the card will be the bottleneck then.

The PCI bus will be your limiting factor for IDE.  If you build a SCSI 
based system, using Ultra 160 SCSI on a PCI-66 (266 MB/s bus), you 
would normally achieve about 200 MB/s sustainable bandwidth (the budget you 
have to work with).  If each disk can talk at 40 MB/s (10k RPM disks can 
do this), then you can in theory have 5 disks in a RAID0 stripe.  Just add 
more 

	device	    /dev/disk
	raid-disk     N+1

lines in your /etc/raidtab.  The /dev/disk is the disk device, and the N+1 
is the next raid disk number (starting from 0).  So if you have 5 SCSI 
disks, sda to sde, all using partition 2 for the file system, your 
/etc/raidtab file would have the device section looking more like this:

        device      /dev/sda2
        raid-disk   0
        device      /dev/sdb2
        raid-disk   1    
        device      /dev/sdc2
        raid-disk   2    
        device      /dev/sdd2
        raid-disk   3    
        device      /dev/sde2
        raid-disk   4    

Of course, your chunk-size parameter will need to be optimized, and you 
will need to change the nr-raid-disks to 5.

This will give you RAID0, 5 way striping.
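
For longer stripes, the device section can be generated rather than typed 
by hand.  What follows is a minimal Python sketch, not a definitive tool.  
The function name raidtab_stanza, the md device /dev/md0, and the 
chunk-size of 32 are illustrative assumptions (the chunk size still has to 
be tuned), and persistent-superblock is included only because it is a 
commonly used raidtab setting.

```python
def raidtab_stanza(devices, md_device="/dev/md0", chunk_size_kb=32):
    """Return raidtab text for a RAID0 stripe over the given partitions.

    md_device and chunk_size_kb are placeholder assumptions; the chunk
    size in particular still has to be tuned for the workload.
    """
    lines = [
        f"raiddev {md_device}",
        "        raid-level              0",
        f"        nr-raid-disks           {len(devices)}",
        "        persistent-superblock   1",
        f"        chunk-size              {chunk_size_kb}",
    ]
    # One device/raid-disk pair per partition, numbered from 0.
    for number, device in enumerate(devices):
        lines.append(f"        device      {device}")
        lines.append(f"        raid-disk   {number}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # sda2 through sde2, matching the five-disk example.
    disks = [f"/dev/sd{letter}2" for letter in "abcde"]
    print(raidtab_stanza(disks), end="")
```

Review the generated text before copying it into /etc/raidtab; the sketch 
only formats the entries and does not check that the partitions exist.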

For IDE, you can purchase an inexpensive ATA100 IDE card for PCI, and put 
2 disks on it (one on each channel).  Together with another disk on the 
main IDE channels, you can get a 3 way stripe, which should put you pretty 
close to 100 MB/s.
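
The disk-count arithmetic here is just a bandwidth budget divided by the 
per-disk rate.  A rough Python sketch, using the figures quoted in this 
message (about 200 MB/s sustainable on the bus, 40 MB/s per 10k RPM SCSI 
disk).  The roughly 33 MB/s per IDE disk is an assumption back-derived from 
the "close to 100 MB/s" 3 way stripe estimate, and the function names are 
made up for illustration.

```python
def max_useful_disks(bus_budget_mb_s, per_disk_mb_s):
    """Largest stripe width whose combined disk rate fits in the bus budget."""
    return int(bus_budget_mb_s // per_disk_mb_s)


def stripe_bandwidth(n_disks, per_disk_mb_s, bus_budget_mb_s=None):
    """Ideal RAID0 streaming rate: disks add up until the bus caps them."""
    total = n_disks * per_disk_mb_s
    if bus_budget_mb_s is not None:
        total = min(total, bus_budget_mb_s)
    return total


if __name__ == "__main__":
    # SCSI case: 200 MB/s budget, 40 MB/s per disk.
    print(max_useful_disks(200, 40))
    # A sixth disk adds nothing once the bus budget is used up.
    print(stripe_bandwidth(6, 40, 200))
    # IDE case: 3 disks at an assumed ~33 MB/s each.
    print(stripe_bandwidth(3, 33))
```

These are best-case streaming numbers; real throughput depends on chunk 
size, file system, and access pattern.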

> 2. In general, how much of a speedup do you expect if you use 2-disk and 
> 5-disk software raid, compared to the same disk drive used standalone?

Rather hard to answer a general question like this.  Which version of 
RAID: 0, 1, 3, or 5?  Which file system?  What type of access pattern 
(large sequential block reads, versus smaller random reads and writes)?

This quickly gets into a discussion of how to tune for a specific range of 
applications.  My company, Scalable Informatics (formed after I left 
MSC.Software), could help answer some of these.

Let me know if you want to talk.  Thanks Simon!

Joe

> 
> Thanks,
> Simon
> 
> Joe Landman wrote:
> > Hi folks:
> > 
> >   Chris had asked about methods to do RAID optimization on compute
> > nodes.  The idea being that he could get good/better performance out of
> > his system if he tuned various options in the software RAID0 under
> > linux.
> > 
> >   The following URL is a short writeup I had been working on for a while
> > now.  
> > 
> > http://scientificappliance.com/scalable_fs_part_1.html
> > 
> > If you have any questions or comments, please feel free to send them to
> > me.   Thanks!
> > 
> 
> 
> 

-- 
Joe Landman,
email: landman@scientificappliance.com
web  : http://scientificappliance.com