[Bioclusters] Need some advice on a cluster for EST/cDNA assembly, clustering

Tim Cutts tjrc at sanger.ac.uk
Mon Feb 28 04:02:07 EST 2005

On 27 Feb 2005, at 6:37 pm, Elia Stupka wrote:

> I've been hoping to be able to access large amounts of memory on 
> affordable servers for a while. In reality, though, the 4GB OS limit 
> has hardly been the issue, since unfortunately the cost of memory is 
> still very high and hardware vendors seldom offer more than 4GB per 
> processor.
> The Sun Opterons are the only ones (among the mainstream vendors) that 
> offered us a 4-way option with 32 GBs. The Apple G5s are still limited 
> to 8GB (4 per processor, probably when Tiger will be truly released 
> they will finally offer more memory slots?), IBM Opterons offer 16GB 
> (still only 4GB per processor), the blade versions are always limited 
> in memory, etc... then you are left with the usual suspects (Power5s, 
> etc.) who have been dealing with more memory for a long time, but at a 
> nasty price...
> ...as long as it costs more to equip hardware with good amounts of 
> memory than it costs to buy the hardware, the refinement of 64-bit OS 
> for access to large amounts of memory can't take off properly, can it?

There's always the option of a mixed-architecture cluster: buy lots of 
cheap 32-bit boxes for the vast majority of compute tasks, which can run 
in a 4GB address space, and then spend the money saved on one or two 
large and much more expensive machines to handle the very large-memory 
tasks.  This is the approach we have taken.  We have about 500 
dual-processor Pentium 4 machines, with only 2GB per processor.  
Almost all our workflow goes through those machines, but the LSF 
cluster also contains a couple of huge-memory machines for the other 
stuff: two SGI Altix 350 machines with 192GB of RAM each, one with 4 
CPUs, the other with 16.  Obviously, 192GB of memory is *seriously* 
expensive, and not what a lot of groups would be able to afford, but 
smaller Altix boxes are really quite affordable [as Itanic machines 
go], and if you want a 64GB machine with, say, 4 CPUs, they're only 
mildly expensive  :-)
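
To make the routing idea concrete, here is a minimal sketch of how a 
job's memory need might map onto two classes of machine.  The class 
names and the helper pick_class() are hypothetical, invented for 
illustration; the memory figures are only loosely modelled on the 
numbers above (2GB per processor on the small nodes, 192GB on the 
large ones) and are not a description of our actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineClass:
    name: str              # hypothetical label, not a real queue or host name
    max_job_mem_gb: float  # most memory a single job is assumed to get


# Assumed figures, loosely based on the cluster described in the message.
SMALL = MachineClass("small-32bit", 2.0)
LARGE = MachineClass("large-memory", 192.0)


def pick_class(job_mem_gb, classes=(SMALL, LARGE)):
    """Return the smallest machine class whose memory fits the job."""
    for mc in sorted(classes, key=lambda c: c.max_job_mem_gb):
        if job_mem_gb <= mc.max_job_mem_gb:
            return mc
    raise ValueError(f"no machine class can hold a {job_mem_gb}GB job")


if __name__ == "__main__":
    for need in (1.5, 64):
        print(need, "GB ->", pick_class(need).name)
```

Trying the smallest class first keeps the expensive machines free for 
the jobs that cannot run anywhere else.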

The main issue with several types of machine in the same cluster is, of 
course, that the users then have to specify their requirements to the 
DRM (LSF, in our case) so that it schedules each job to the right kind 
of machine.  And I don't know about you guys, but our users are 
notoriously bad at estimating what their jobs' requirements actually 
are.
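
One way to take some of the guesswork away from users is to derive the 
request from the peak memory of earlier runs and then build the 
submission command from that.  The sketch below is only an 
illustration: suggest_request_mb() and bsub_command() are hypothetical 
helpers, the 25% safety margin is an arbitrary assumption, and the 
memory unit is assumed to be MB, which depends on how LSF is 
configured at a given site.  The only LSF features relied on are the 
bsub -R option with its select[] and rusage[] sections.

```python
import math
import shlex


def suggest_request_mb(observed_peaks_mb, margin=1.25):
    """Suggest a memory request: the largest observed peak plus a margin."""
    if not observed_peaks_mb:
        raise ValueError("need at least one observed peak")
    return math.ceil(max(observed_peaks_mb) * margin)


def bsub_command(job_cmd, mem_mb):
    """Build (but do not run) a bsub argument list asking for mem_mb.

    select[] restricts scheduling to hosts with enough free memory;
    rusage[] reserves that memory for the job. Units are assumed to be MB.
    """
    resreq = f"select[mem>{mem_mb}] rusage[mem={mem_mb}]"
    return ["bsub", "-R", resreq] + shlex.split(job_cmd)


if __name__ == "__main__":
    # Hypothetical peaks, in MB, from three earlier runs of the same job.
    request = suggest_request_mb([1200, 1500, 1400])
    # "./assemble ests.fa" is a made-up job command for illustration.
    print(" ".join(shlex.quote(a) for a in bsub_command("./assemble ests.fa", request)))
```

The command is returned as a list rather than executed, so a wrapper 
script can log or inspect it before submitting anything.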


Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5  860B 3CDD 3F56 E313 4233
