[Bioclusters] parallel blast???
Chris Dagdigian
bioclusters@bioinformatics.org
Mon, 16 Sep 2002 20:36:43 -0400
Be careful with your benchmarks as they can be meaningless or
misleading. You will find that the speed of blast distributed within a
cluster or compute farm is directly related to 2 things: (a) the amount
of physical memory in the compute nodes and (b) the speed of your
storage or disk I/O system.
You can have the fastest server on earth, but if you are searching with
blast against an NFS mounted database and your network or fileserver is
slow then your blast searching speeds will be horrible. Give me a small
number of speedy linux boxes and I can bring a $300,000 NFS/NAS system
to its knees. Storage does matter.
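To illustrate the point about NFS, here is a minimal sketch of staging a database onto node-local disk before searching, so that repeated searches do not hit the fileserver. The function name, the directory arguments, and the assumption that the database is a set of files sharing one base name are mine for illustration only; adapt them to your own layout.

```python
import shutil
from pathlib import Path


def stage_database(shared_dir: str, local_dir: str, db_name: str) -> list:
    """Copy files named db_name.* from shared_dir to local_dir.

    A file is skipped when a local copy of the same size exists that is
    at least as new as the shared one. Returns the paths that were copied.
    (Hypothetical helper; the names and layout are assumptions.)
    """
    src_root = Path(shared_dir)
    dst_root = Path(local_dir)
    dst_root.mkdir(parents=True, exist_ok=True)

    copied = []
    for src in sorted(src_root.glob(db_name + ".*")):
        dst = dst_root / src.name
        s = src.stat()
        if dst.exists():
            d = dst.stat()
            # Compare whole seconds so coarse filesystem timestamps
            # do not trigger needless re-copies.
            if d.st_size == s.st_size and int(d.st_mtime) >= int(s.st_mtime):
                continue
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(dst)
    return copied
```

Run it once per node at the start of a job; the second call on an unchanged database copies nothing, so only the first job on each node pays the network cost.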
Blast performance also depends on how you tune your DRM (gridengine, LSF,
etc.) and on how you adjust your workflow: splitting large databases,
locally caching data on compute nodes, and so on.
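As a rough sketch of the splitting idea, the following deals FASTA records round-robin into a fixed number of chunks, so each chunk can be formatted and searched as a separate job. The function names are hypothetical, and round-robin by record count is only one possible policy; balancing by total sequence length may suit your data better.

```python
from typing import Iterable, Iterator, List


def read_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield each FASTA record (header plus sequence lines) as one string."""
    record: List[str] = []
    for line in lines:
        # A new header closes the record collected so far.
        if line.startswith(">") and record:
            yield "".join(record)
            record = []
        if line.strip():  # skip blank lines
            record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield "".join(record)


def split_fasta(lines: Iterable[str], n_chunks: int) -> List[List[str]]:
    """Deal records round-robin into n_chunks lists of near-equal count."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[str]] = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(read_fasta(lines)):
        chunks[i % n_chunks].append(rec)
    return chunks
```

Pass it an open file handle, e.g. `split_fasta(open("db.fasta"), 8)` (the file name is a placeholder), and write each returned list to its own file. Note that this holds every record in memory, so for a very large database you would want to stream records straight to the output files instead.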
What are you trying to benchmark for? Picking the right CPU? Some
people on this list may have already done this. My personal preference
is Intel Pentium III's right now because:
o P IV's are way too expensive
o P III's are dirt cheap
o There are a ton of dual-CPU motherboard options for the PIII allowing
me flexible choices of system packaging and vendor
o Athlon / AMDs are super fast but your motherboard choices are
limited and you need to be really careful about cooling and ventilation
--Chris
http://bioteam.net
On Monday, September 16, 2002, at 07:24 PM, Romualdo Zayas Lagunas
wrote:
> Hello everyone,
>
> I am part of a computational genomics team at CIFN-UNAM in Mexico.
> Currently, we are trying to purchase a cluster (Linux and 32 or 48
> dual nodes), but since we lack experience in the field we
> would like to perform some tests on some clusters first. Can you give
> me any pointers to URLs or any resources where I can download parallel
> blast or scripts that run blast in parallel (and its different command
> line options)?
>
> I will really appreciate any help you can give me.
>
>
> Thanks a lot in advance
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> M.C. Romualdo Zayas Lagunas
> CIFN-UNAM
> rzayas@cifn.unam.mx
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters