[Bioclusters] Reasonable blast runtime

Lucas Carey lcarey at odd.bio.sunysb.edu
Fri Feb 3 16:16:01 EST 2006


Hi Paul,

I notice you haven't specified an output file here. I'm not sure how mpich handles buffering of stdout, but that might be hurting you. Try running with '-o /path/to/outputfile'.
The output may also be going to a temp file. Running 'lsof' on the rank 1 process will show you which files it has open.
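
Once the results go to a file, you can also get a rough idea of how far the job has got. This is only a sketch: it assumes the default pairwise output format (one 'Query=' line per reported query) and that mpiblast writes results as it goes rather than all at the end, which I have not checked. blast_progress is just a name I made up.

```shell
# blast_progress QUERY_FASTA BLAST_OUTPUT
# Counts FASTA headers in the input and "Query=" lines in the output.
# Assumption: default pairwise BLAST output, written incrementally.
blast_progress() {
    # grep -c exits non-zero when the count is 0, hence "|| true".
    total=$(grep -c '^>' "$1" || true)
    reported=$(grep -c '^Query=' "$2" || true)
    echo "$reported of $total queries reported"
}
```

For example: blast_progress /home/kieran/rnd.seq /path/to/outputfile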
 
Output can often be rate-limiting with mpiblast. Do you expect to get a lot of hits? If so, you may want to limit the size of the output file by limiting the number of results returned per query (blastall -b and/or -v) or, more usefully, by setting an e-value cutoff (-e).
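
As a sketch, your launch line with an output file and tighter limits might look like the following. The output path, the e-value cutoff and the hit counts are placeholder values I picked for illustration, not recommendations, and mpiblast_cmd is a made-up helper that only prints the command so you can review it first.

```shell
# Illustrative values only; adjust them to your own data.
OUT=/home/kieran/rnd.out   # hypothetical output file for -o
EVALUE=1e-10               # -e: report only hits at or below this e-value
MAXHITS=20                 # -v / -b: descriptions / alignments per query

# Print the full command line without running it.
mpiblast_cmd() {
    echo P4_GLOBMEMSIZE=268435456 /opt/mpich/gnu/bin/mpirun -np 14 \
        /usr/local/bin/mpiblast -p blastn -d whole_genome.fa \
        -i /home/kieran/rnd.seq -o "$OUT" \
        -e "$EVALUE" -v "$MAXHITS" -b "$MAXHITS"
}

mpiblast_cmd
```

If the printed line looks right, paste it into the shell to launch the job.
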
What percentage of the CPU is rank 1 using compared to ranks 2-13?
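
To compare CPU use, something like the following on each node will do. It is only a sketch: only_mpiblast is a made-up helper name, and it assumes the processes show up in ps under the command name 'mpiblast'. It will not tell you which PID is which rank.

```shell
# Keep the ps header line plus rows whose last column is "mpiblast".
# Expects input in the style of: ps -eo pid,pcpu,etime,comm
only_mpiblast() {
    awk 'NR == 1 || $NF == "mpiblast"'
}
```

Usage on a compute node: ps -eo pid,pcpu,etime,comm | only_mpiblast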

-Lucas

On Friday, February 03, 2006 at 11:07 -0800, Paul Mc Kenna wrote:
> We are trying to get the hang of a small cluster we just built. A 
> co-worker launched a 10K query against the whole human genome. It has 
> been running for 3 days now! He had previously launched a 1K query which 
> something else likely to be going on. Is there any way to check on 
> exactly how much of a Blast job has been completed?
> 
> Process currently running on 6 compute nodes, 4 are dual processor 
> boards. Most have 2G or more of memory. A quick look at the stats 
> showed no more than 12% of available memory being used, less on most 
> machines. Essentially no swap has been used to the best of my knowledge.
> 
> We used the following syntax to launch the job:
> 
> P4_GLOBMEMSIZE=268435456 time  /opt/mpich/gnu/bin/mpirun -np 14 
> /usr/local/bin/mpiblast -p blastn -d whole_genome.fa -i /home/kieran/rnd.seq
> 
> 
> Thanks
> 
> 
> Paul
> _______________________________________________
> Bioclusters maillist  -  Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters

