[Bioclusters] requesting help for computational server setup (PE 4600)

Aaron Darling bioclusters@bioinformatics.org
Wed, 17 Sep 2003 16:36:22 -0500 (CDT)


On Wed, 17 Sep 2003, Joe Landman wrote:

> On Wed, 2003-09-17 at 15:57, Aaron Darling wrote:
> > I would suggest taking your disks out of RAID 0.  See:
> > http://www.storagereview.com/php/tiki/tiki-index.php?page=SingleDriveVsRaid0
>
> This review isn't all that helpful to the informatics folks though.
> Many of the codes I have seen are limited by large block sequential
> reads, which the article indicates to be one of the strengths of RAID0.

Yes, many bioinformatic codes perform large amounts of sequential I/O and
thus stand to benefit from RAID 0.  It is also an attractive option given
that sequence databases are huge and RAID 0 yields maximum drive capacity.
However, in a multi-user environment two separate programs performing
sequential I/O will cause contention for the drive arms if the users are
requesting data from different regions of the disk.  To make an
informed decision, we would need an accurate characterization of the usual
I/O pattern on the server.  We can certainly contrive situations where any
one of {JBOD,0,1,1+0,5} looks best.  In general, though, I argue that
RAID 1 would provide the best option for a server with 2 disks if disk
space can be sacrificed, since most RAID 1 implementations can serve two
concurrent read streams from separate disks.  If disk space is at a
premium, it may be
possible for the sysadmin to divvy up sequence databases for different
apps across the drives in a JBOD configuration using knowledge of what the
4 users will be doing.  Ideally, doing so would isolate workloads to each
disk in a way that mitigates the cost of drive arm contention...
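
As a rough illustration of the contention argument above, here is a toy
model in Python.  Everything in it is an assumption made for the sketch:
the function names, the fixed seek cost, the per-block transfer time, the
chunk size each reader gets per turn, and the simplification that a
2-disk RAID 0 halves transfer time but moves both arms on every switch
between readers.  It is not a measurement of any real drive or controller.

```python
# Toy model of drive arm contention. All figures are illustrative
# assumptions, not measurements.

def shared_time_ms(streams, blocks_per_stream, chunk, seek_ms, xfer_ms):
    """Elapsed time when `streams` sequential readers share one spindle set.

    Readers take turns, `chunk` blocks per turn. With more than one
    reader, every turn costs a seek because the arm must move to that
    reader's region of the disk. A lone reader seeks once and streams.
    """
    turns_per_stream = -(-blocks_per_stream // chunk)  # ceiling division
    seeks = streams * turns_per_stream if streams > 1 else 1
    return seeks * seek_ms + streams * blocks_per_stream * xfer_ms


def isolated_time_ms(blocks_per_stream, seek_ms, xfer_ms):
    """Elapsed time when each reader has a disk to itself.

    The disks work in parallel, so elapsed time is that of one stream:
    a single initial seek followed by uninterrupted transfer.
    """
    return seek_ms + blocks_per_stream * xfer_ms


BLOCKS = 10_000   # blocks each reader wants (assumed)
CHUNK = 16        # blocks served per turn under contention (assumed)
SEEK_MS = 8.0     # cost of moving the arm between regions (assumed)
XFER_MS = 1.0     # single-disk transfer time per block (assumed)

# 2-disk RAID 0: transfer time per block is halved by striping.
raid0_one_reader = shared_time_ms(1, BLOCKS, CHUNK, SEEK_MS, XFER_MS / 2)
raid0_two_readers = shared_time_ms(2, BLOCKS, CHUNK, SEEK_MS, XFER_MS / 2)

# RAID 1 or a well-partitioned JBOD: one reader per disk.
one_reader_per_disk = isolated_time_ms(BLOCKS, SEEK_MS, XFER_MS)

print("RAID 0, one reader:   ", raid0_one_reader, "ms")
print("RAID 0, two readers:  ", raid0_two_readers, "ms")
print("one reader per disk:  ", one_reader_per_disk, "ms")
```

Under these assumed numbers a lone reader finishes fastest on RAID 0,
but two concurrent readers finish sooner when each has a disk to itself.
Changing CHUNK or SEEK_MS shifts the crossover, which is why the real
I/O pattern on the server matters.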

-Aaron