[Bioclusters] a dedicated cluster to mpiblast the nr database

Aaron Darling bioclusters@bioinformatics.org
Thu, 4 Dec 2003 18:05:45 -0600 (CST)


Because mpiBLAST uses NCBI's implementation of BLAST (in the NCBI
Toolbox), its speedup on any given processor architecture should be
proportional to the speedup seen when running standard NCBI BLAST on that
architecture.
Thus, if BLAST runs 1.5 times faster on your {opteron,G5,whatever},
mpiBLAST should also run 1.5 times faster.
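
As a rough worked example of that proportionality (a sketch only: the
function name and the 100-minute baseline are made up for illustration,
not measurements):

```python
def estimated_mpiblast_time(baseline_minutes, blast_speedup):
    """Scale a baseline mpiBLAST wall time by the speedup seen for plain NCBI BLAST."""
    if blast_speedup <= 0:
        raise ValueError("speedup must be positive")
    # Assumption: mpiBLAST inherits the same per-architecture speedup as BLAST.
    return baseline_minutes / blast_speedup


# Hypothetical baseline: a job that takes 100 minutes on the old architecture.
print(estimated_mpiblast_time(100.0, 1.5))
```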

Having adequate RAM to cache the database minimizes disk I/O, a
significant bottleneck for large databases.  I don't know what the
performance hit is, if any, for using memory beyond the 32-bit address
space under Linux on a 32-bit processor.  Does anybody else have an idea?
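
To make the RAM question concrete, here is a back-of-the-envelope check of
whether each node's share of the database fits in memory.  It is only a
sketch: the 24 GB database size and the 3 x 8 GB nodes come from Mike's
message below, while the even split across nodes and the 1 GB of headroom
for the OS and BLAST itself are my assumptions.

```python
def fits_in_ram(db_gb, nodes, ram_per_node_gb, headroom_gb=1.0):
    """Return (per-node share in GB, True if it fits in RAM minus headroom).

    Assumes the database is split evenly across nodes; headroom_gb is a
    guessed allowance for the OS and the BLAST processes, not a measurement.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    share = db_gb / nodes
    return share, share <= ram_per_node_gb - headroom_gb


# Figures from the quoted message: 24 GB estimate, 3 nodes with 8 GB each.
share, ok = fits_in_ram(db_gb=24, nodes=3, ram_per_node_gb=8)
print(f"{share:.1f} GB per node, fits: {ok}")
```

With those numbers each node would have to cache 8 GB in 8 GB of RAM,
leaving nothing for anything else, so a fourth node or more RAM per node
looks safer under these assumptions.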

-Aaron

On Thu, 4 Dec 2003, Mike Cariaso wrote:

> I'm looking to put together a small cluster for a very
> large number of blasts against a local copy of the
> ncbi nr database.
>
> I am under the impression the best thing I can do is
> to get it all in memory, for which I'm estimating 24GB
> should cover me.
>
> I've found a vendor for 3 dual 2.4G Xeon each with
> 8GB (ram: DDR 8x1G ecc pc2100). The total price is
> about $12k for the 3 of them.
>
> Would mpiblast benefit from a 64-bit platform, or will
> 32 bits perform reasonably close?
>
> I hope to just add 1 more machine in the future as nr
> grows.
>
> Given this config what might be my bottleneck?
>
> Any advice, experience, or warnings would be greatly appreciated.
>
> =====
> Mike Cariaso