[Bioclusters] cluster or SMP

Chris Dagdigian dag at sonsorol.org
Tue Sep 4 12:46:09 EDT 2007

Diskless clusters are generally a bad idea for any system that will
be running life science informatics workflows. Without knowing your
specific applications, the usual rule is that most life science codes
are performance bound by memory and disk I/O. If you build a diskless
cluster and do lots of assemblies on genomes that are too large to fit
in memory, your cluster is going to be bogged down while nodes wait on
network I/O requests.
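
To put rough numbers on that, here is a back-of-envelope sketch in
Python. Everything in it is an assumption made for illustration: a
single gigabit link on the file server shared evenly by all nodes, a
10 GB working set per node, and a 60 MB/s local disk. Only the node
count (32) comes from the original question. Real throughput depends
on the file server, the switch and the access pattern, so treat the
output as an order-of-magnitude guide only.

```python
# Back-of-envelope estimate. All figures are illustrative assumptions,
# not measurements from any real cluster.

GIGABIT_BYTES_PER_SEC = 1e9 / 8  # theoretical peak of one gigabit link (125 MB/s)


def per_node_bandwidth(link_bytes_per_sec: float, active_nodes: int) -> float:
    """Bandwidth each node gets if all nodes share one server link evenly."""
    return link_bytes_per_sec / active_nodes


def read_seconds(dataset_bytes: float, bytes_per_sec: float) -> float:
    """Time to stream a dataset once at a given sustained rate."""
    return dataset_bytes / bytes_per_sec


if __name__ == "__main__":
    dataset = 10e9     # assumed: 10 GB working set per node
    local_disk = 60e6  # assumed: 60 MB/s sustained local disk read
    nodes = 32         # node count from the original question

    shared = per_node_bandwidth(GIGABIT_BYTES_PER_SEC, nodes)
    net_min = read_seconds(dataset, shared) / 60
    disk_min = read_seconds(dataset, local_disk) / 60

    print(f"per-node share of the link: {shared / 1e6:.1f} MB/s")
    print(f"one pass over the shared network: {net_min:.0f} min")
    print(f"one pass from local disk: {disk_min:.0f} min")
```

Trunking a few gigabit ports multiplies the link figure by the number
of ports at best, so under these assumptions it narrows the gap but
does not close it.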

You can always tell an HPC vendor who knows nothing about life science
because they try to sell you (a) MPI-optimized "beowulf" systems or
(b) diskless systems, both of which are (in most cases) inappropriate
technology choices for our styles of work.

I think you are on to something with the SMP idea. You should
research how many CPUs and how much memory you can pack into a single
chassis along with a well-performing storage subsystem.

My $.02, of course!


On Sep 1, 2007, at 3:56 PM, Marcos de Carvalho wrote:

> Hi list,
> I am in charge of setting up a new bioinformatics lab at my
> university, and I am wondering what would be the best use of my
> current budget for the high-performance machine. It will mainly do
> genome assemblies, gene finding and homology searches for at least
> 3 metagenomics projects (but it will probably be used for other
> applications too, like molecular dynamics). For this machine
> specifically I have about 23,000 US$, and my first thought was to
> build a cluster (which at current local prices could give me a
> 32-node diskless beowulf with E4400 chips or 16 nodes with Q6600
> chips). However, I saw some pretty good SMP machines (with a max of
> 8 dual-core Opterons) for about the same price.
>  Leaving aside the fun of building the cluster, could the relatively
> easier administration of an SMP machine and its more general-purpose
> applicability justify choosing it over a cluster? Even with 48
> more cores, could the network be a serious bottleneck in comparison
> with the SMP machine? Could gigabit port trunking be a
> solution for the network bottleneck?
>  I am not in a hurry for this machine, so the time needed to build
> the cluster can be disregarded.
> Thanks in advance.
> Regards,
> Marcos
> _______________________________________________
> Bioclusters maillist  -  Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
