[Bioclusters] Versions of Blast that run on a cluster?

Joe Landman landman at scalableinformatics.com
Wed Jan 5 09:36:48 EST 2005


Daniel.G.Roberts at aventis.com wrote:

> Hello All
> Can anyone point me to example/FAQ resources on BLAST implemented on a 
> Linux Cluster?
>

Hi Dan:

  What sort of cluster?  Have you built it, or are you planning on 
building it?

  Have a look at 
http://downloads.scalableinformatics.com/downloads/ncbi/ for NCBI blast 
RPMs, though this will not run "across" the cluster (only on a single 
node).  Also have a look at 
http://downloads.scalableinformatics.com/downloads/mpiblast/ for 
mpiblast (http://mpiblast.lanl.gov) RPMs, for a blast which will run 
across the cluster, and our tool  
http://downloads.scalableinformatics.com/downloads/run_mpiblast  and 
http://downloads.scalableinformatics.com/downloads/run_mpiblastrc to 
drive mpiblast in a multiuser environment via GridEngine.

> I have been asked to install BLAST on our cluster, but I don't know 
> much about BLAST.
> What BLAST versions are run in a 64 and 32 bit env?
>

Depends upon the processor.  We have binary and source RPMs for AMD64 
and ia32 based systems.  We might be able to generate some Itanium2 RPMs 
if needed.


> In general what type of compute nodes are required for BLAST to run 
> effectively?  I/O Intensive nodes?
>

It sounds like we should have a longer discussion, possibly offline.

BLAST likes memory.  The more memory the better.  Depending upon the 
size of the databases you plan to search against, you might need to 
consider segmenting the database so that each piece fits in a node's 
memory.  BLAST uses mmap for index input, so if you have to read/reread 
the database indices over and over, you will want pretty good I/O 
bandwidth (even more so if you use a huge database without 
segmentation).
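
As a rough illustration of the segmentation point above, here is a 
minimal Python sketch that estimates how many equal-sized pieces a 
database would need to be split into so that each piece fits in a 
node's memory.  The function name fragments_needed and the 
usable_fraction parameter (the share of RAM assumed to be available 
for the database) are hypothetical, chosen only for this example; 
they are not part of NCBI BLAST or mpiblast, and the 0.5 default is a 
guess, not a measured figure.

```python
import math

GiB = 1024 ** 3  # bytes in one gibibyte


def fragments_needed(db_bytes, node_ram_bytes, usable_fraction=0.5):
    """Estimate how many equal-sized database fragments are needed so
    that each fragment fits in the RAM assumed usable on one node.

    usable_fraction is an assumption for illustration: the part of
    physical RAM left over after the OS and other processes.
    """
    if db_bytes <= 0 or node_ram_bytes <= 0:
        raise ValueError("sizes must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    usable_bytes = node_ram_bytes * usable_fraction
    # Round up: a partial fragment still needs its own segment.
    return max(1, math.ceil(db_bytes / usable_bytes))


if __name__ == "__main__":
    # Example: a 12 GiB database on nodes with 4 GiB of RAM each.
    print(fragments_needed(12 * GiB, 4 * GiB))
```

If the result is 1, the database is small enough to be held in memory 
on a single node under these assumptions, and segmentation is probably 
not needed on memory grounds alone.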

> Anyone running BLAST using the Torque Queue and Maui Scheduler?
>

Having had long and direct experience with Torque's predecessors, I 
would (strongly) advise you to look at alternatives such as GridEngine.  
You can use Torque, but I would not advise it unless you have no other 
choice.

> Any and all help is greatly appreciated!
>

disclosure:  my company sells clusters/support/integration services for 
exactly these cases.

> Thanks
> Dan
>

Joe

landman at scalableinformatics.com
http://www.scalableinformatics.com

