[Bioclusters] mid range storage solutions

Guy Coates gmpc at sanger.ac.uk
Mon Oct 1 05:28:24 EDT 2007

Bonnie Hurwitz wrote:
> Hi all,
> I was just wondering if anyone has a recommendation for mid-range
> storage.  We need to purchase a storage server for our cluster to act as
> a mySQL database server that has around 32 terabytes of disk space
> utilizing Raid10.  Also, we are looking for fast disks and a 10-20Gb
> card since this is meant for a database server and we want to try to
> minimize resource contention from writing to the db server from the
> nodes.  We currently have 500 nodes on our cluster.
> What are people currently using for similar database servers?  What has
> the performance been like when writing to the databases from compute nodes?

It is quite easy to overload a well-tuned, beefy mysql server from a small
compute farm. There are several things you can do to increase the server's
performance:

1) Use innodb rather than myisam tables. myisam tables have some really nasty
performance bottlenecks. Updates and deletes require an exclusive table-level
lock, so if you have lots of jobs trying to update a database at the same time,
performance will be abysmal. innodb uses row-level locking and does not suffer
from these issues, so you should use it.

innodb is also a more robust data format, so when you crash your database, you
don't have to wait an age whilst you myisamchk all your tables.

2) Bump up the various mysql buffer sizes; key_buffer_size and
innodb_buffer_pool_size are the important ones, but be aware that you can't set
key_buffer_size > 4GB, otherwise you'll crash the database. (See the links
below for a full explanation of what these do.)

3) Think about throttling your jobs. Here at Sanger, we feed database load
information into our queuing system. We use a rough metric of
load = number_of_connections + (number_of_queries * 10).

The queuing system can then use this load information to throttle job execution
on the cluster and prevent the database from being overwhelmed.
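
As a minimal sketch of the load metric in point 3, the calculation and a
simple dispatch check might look like the following. The function names and
the max_load threshold are assumptions for illustration; how the connection
and query counts are gathered, and how the result is fed to the queuing
system, are not described here, so the counts are passed in as plain integers.

```python
def db_load(connections: int, queries: int) -> int:
    """Rough load metric: connections plus ten times the running queries."""
    return connections + queries * 10


def can_dispatch(connections: int, queries: int, max_load: int = 500) -> bool:
    """Return True if another database job may be started.

    max_load is a hypothetical threshold; tune it for the actual server.
    """
    return db_load(connections, queries) < max_load


# Example: 120 connections and 30 running queries give a load of 420.
print(db_load(120, 30), can_dispatch(120, 30))
```

Weighting queries ten times more heavily than connections means a handful of
busy clients counts for more than many idle ones.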
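
For point 1, existing myisam tables can be converted with ALTER TABLE. The
sketch below only builds the statements so they can be reviewed before being
run; the helper name and the table names are hypothetical.

```python
def innodb_conversion_sql(tables):
    """Build one ALTER TABLE ... ENGINE=InnoDB statement per table name.

    Identifiers are quoted with backticks; a backtick inside a name is doubled.
    """
    statements = []
    for name in tables:
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} ENGINE=InnoDB;")
    return statements


# Hypothetical table names, for illustration only.
for stmt in innodb_conversion_sql(["hits", "alignments"]):
    print(stmt)
```

Each ALTER TABLE rebuilds the table, so on large tables it is best run when
the cluster is not writing to the database.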

There is a good selection of mysql performance tips here:






Dr. Guy Coates,  Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK
Tel: +44 (0)1223 834244 x 6925
Fax: +44 (0)1223 496802

 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
