[Bioclusters] Re: MPI clustalw

Guy Coates bioclusters@bioinformatics.org
Sun, 9 May 2004 11:17:16 +0100 (BST)


> example web servers and services where you need rapid response for
> single, or small numbers of jobs.

We (well, the ensembl-ites) do run a small amount of mpi-clustalw. The
algorithm scales OK for small alignments (but they run quickly, so why
bother?) but is horrible for large alignments.

These are figures for an alignment of a set of 9658 sequences, running on
dual 2.8GHz PIV machines with gigabit. Efficiency is relative to the
2-CPU run.

Ncpus   Runtime    Efficiency
-----   --------   ----------
2       28:21:33   1.00
4       19:49:05   0.72
8       14:49:02   0.48
10      14:09:41   0.40
16      13:37:36   0.26
24      13:00:30   0.18
32      12:48:39   0.14
48      12:48:39   0.09
64      11:19:40   0.08
96      11:30:09   0.05
128     11:13:28   0.04
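
For anyone wanting to check the numbers: the Efficiency column is
consistent with taking the 2-CPU run as the baseline, i.e.
(2 * T_2) / (N * T_N). Below is a minimal sketch of that calculation,
assuming the runtimes are hh:mm:ss. The function names are made up for
illustration and are not part of clustalw or any MPI tooling.

```python
def to_seconds(hms):
    """Convert a runtime string (assumed to be 'hh:mm:ss') to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def efficiency(ncpus, runtime, base_ncpus=2, base_runtime="28:21:33"):
    """Parallel efficiency relative to a baseline run.

    Assumption: the baseline is the 2-CPU row of the table, since no
    single-CPU figure is given for this alignment.
    """
    base_cpu_seconds = base_ncpus * to_seconds(base_runtime)
    return base_cpu_seconds / (ncpus * to_seconds(runtime))


# A few rows taken from the table above.
for ncpus, runtime in [(4, "19:49:05"), (16, "13:37:36"), (128, "11:13:28")]:
    print(ncpus, runtime, round(efficiency(ncpus, runtime), 2))
```

For the three rows used here this reproduces the 0.72, 0.26 and 0.04
in the table.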

However, although the scaling is horrible, it does at least bring the
runtime down to something more manageable. MPI clustalw only gets run for
the alignments that the single CPU version chokes on. It may not be
pretty, but at least you do get an answer, eventually. Horses for courses
and all that.


>
> Guy/Tim - did you ever deploy that HMMer PVM cluster we talked about
> for the Pfam web site?
>

It's on the ever-expanding list of things to do. So, does anyone here have
any opinions on, or experience with, the PVM version of HMMer?


Guy
-- 
Guy Coates,  Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199