Hi Grace, some further thoughts:

> A popular method for running jobs on two different machines at once is
> to divide the input into parts, send each part to a different machine, run
> the program on each machine using the segment of the input on that machine,

The best approach is to split the input (i.e. your sequences to be analysed) but not the BLAST database. In other words, if possible, have as much local storage as possible for the actual BLAST databases, so that you don't have to split them; all you split is the (small) set of sequences to be analysed.

> Combining parallel (embarrassingly parallel) job execution with
> scheduling/load-balancing features of DRM tools is really the key to
> achieving the efficiency in a cluster that makes it a valuable
> resource for doing things like BLAST.

It's exactly for this integration that we've been developing BioPipe (www.biopipe.org). It makes use of a chosen Load Sharing Software, as well as commonly used software such as bioperl, ensembl, biosql, etc., and manages a bioinformatics workflow through a combination of these in a parallel, load-balanced fashion. You might want to take a look at it; we are currently using it for genome annotation and it serves us well. Of course it is open source and in continuous development, so feel free to try it and shout at us ;)

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel: +65 6874 1467           *
* mobile: +65 9030 7613        *
* fax: +65 6779 1117           *
********************************
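To illustrate the query-splitting idea, here is a minimal sketch (not BioPipe itself, and the file contents and chunk count are just made up for the example): parse a multi-FASTA query set and deal the sequences out into one chunk per node, while every node keeps a full local copy of the unsplit BLAST database.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) records."""
    records, header, seq = [], None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line, []
        elif line.strip():
            seq.append(line.strip())
    if header is not None:
        records.append((header, "".join(seq)))
    return records

def split_queries(records, n_chunks):
    """Deal records round-robin into n_chunks lists, one per cluster node."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append(rec)
    return chunks

if __name__ == "__main__":
    # Hypothetical three-sequence query set, split across two nodes.
    fasta = ">seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n"
    chunks = split_queries(read_fasta(fasta), 2)
    # Each chunk would then be written to its own file and submitted as a
    # separate BLAST job (e.g. through your DRM/Load Sharing Software),
    # all searching the same local, unsplit database.
    for i, chunk in enumerate(chunks):
        print("node %d gets %d queries" % (i, len(chunk)))
```

Since BLAST hits for one query are independent of the other queries, the per-chunk result files can simply be concatenated afterwards, which is what makes this embarrassingly parallel.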