[Bioclusters] Re: new on using clusters: problem running mpiblast (2)

Zhiliang Hu hu at animalgenome.org
Mon Oct 1 14:01:02 EDT 2007


Our cluster vendor's support engineer thinks the Torque launcher may not be 
the problem.  He ran mpiblast with the "--debug" option and got the following:

> /opt/openmpi.gcc/bin/mpirun -np 3 --mca btl openib,self -machinefile 
./machines /home/local/bin/mpiblast --debug -p blastp -i ./bait.fasta -d 
ecoli.aa

[0]     0.00206017      Locking fragment list
[0]     0.00232911      Locked fragment list
[0]     0.00454402      broadcasting file size of 390
[0]     0.00466418      file size broadcasted
[0]     0.00472903      broadcasting file
[0]     0.00487208      file broadcasted
[0]     0.00493002      initializing ncbi ...blastall -p blastp -i 
./bait.fasta -d /raid/pub/ncbi/blast/mpidb/ecoli.aa
[0]     0.00504208
(0) done initializing ncbi.
[0]     0.00791001      Init blast error code 0
[0]     0.0187011       First date: Aug 29, 2007  1:24 AM
[1]     0.034095        Temp name base: /tmp/bait.fastaXXXXXX
[1]     0.034435        Got temp name: /tmp/bait.fasta7ciIil
[1]     0.0344911       waiting for file size broadcast
[1]     0.034523        received file size broadcast of 390
[1]     0.03456 opening receive file /tmp/bait.fasta7ciIil
[1]     0.034647        receiving file to /tmp/bait.fasta7ciIil
[1]     0.0346761       received file broadcast
[1]     0.0348959       Query file received as /tmp/bait.fasta
[1]     0.0349259       Receiving query length adjustments and effective 
db sizes from the master
[1]     0.0349801       1 adjustments and effective db sizes received 
successfully
[1]     0.035022        First date: Aug 29, 2007  1:24 AM
[2]     0.045331        Temp name base: /tmp/bait.fastaXXXXXX
[2]     0.0456631       Got temp name: /tmp/bait.fastasGSwMl
[2]     0.0457201       waiting for file size broadcast
[2]     0.0457571       received file size broadcast of 390
[2]     0.045788        opening receive file /tmp/bait.fastasGSwMl
[2]     0.0458901       receiving file to /tmp/bait.fastasGSwMl
[2]     0.0459352       received file broadcast
[2]     0.046097        Query file received as /tmp/bait.fasta
[2]     0.0461261       Receiving query length adjustments and effective
db sizes from the master
[2]     0.04617 1 adjustments and effective db sizes received successfully
[2]     0.0462182       First date: Aug 29, 2007  1:24 AM
[2]     0.0462492       Temp name base: /tmp/ecoli.aaXXXXXX

[2]     0.0463071       Got temp name: /tmp/ecoli.aaxum8jY
[2]     0.0554571       initializing ncbi ...blastall -p blastp -i 
/tmp/bait.fastasGSwMl -m 11 -o /dev/null -d /tmp/ecoli.aaxum8jY
[2]     0.05564
(2) done initializing ncbi.
[2]     0.0556982       Locking fragment list
[2]     0.0557191       Locked fragment list
[2]     0.0566981       Fragment list sent.
[2]     0.0567541       Idle message sent
[1]     0.0749409       Node 2 has no fragments
[1]     0.0749829       Received fragment list update from node 2
[1]     0.075444        Scheduler got tag 10 from 2
1       0.0755          Bailing out with signal 11
[node001:10599] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 0
0       0.0813701       Bailing out with signal 15
[node003:31592] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 0
2       0.081321        Bailing out with signal 15
[node002:10480] MPI_ABORT invoked on rank 2 in communicator MPI_COMM_WORLD with errorcode 0
2       0.081321        Bailing out with signal 15
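
In the output above, rank 1 bails out with signal 11 (SIGSEGV on Linux) right after the scheduler receives tag 10 from node 2, and the signal 15 (SIGTERM) on the other ranks follows it. To make that ordering easier to see in longer runs, the debug output could be summarized per rank. The following is only a rough sketch in Python, not part of mpiBLAST: it assumes debug lines look like "[rank] time message" as they do above, and the function names are made up for illustration.

```python
import re

# Assumed line shape, taken from the log above: "[1]     0.034095        message"
LINE_RE = re.compile(r"^\[(\d+)\]\s+(\d+(?:\.\d+)?)\s+(.*)$")
# Matches the "Bailing out with signal N" text wherever it appears in a line
SIGNAL_RE = re.compile(r"Bailing out with signal (\d+)")


def last_message_per_rank(log_text):
    """Return {rank: (timestamp, message)} for the last bracketed line of each rank."""
    last = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            rank = int(match.group(1))
            last[rank] = (float(match.group(2)), match.group(3).strip())
    return last


def signals_seen(log_text):
    """Return the signal numbers reported in the log, in order of appearance."""
    return [int(number) for number in SIGNAL_RE.findall(log_text)]


if __name__ == "__main__":
    import sys

    text = sys.stdin.read()
    for rank, (stamp, message) in sorted(last_message_per_rank(text).items()):
        print(f"rank {rank}: last message at {stamp}: {message}")
    print("signals:", signals_seen(text))
```

Saved under a hypothetical name such as summarize_log.py, it would read the captured mpirun output on standard input, for example "python summarize_log.py < debug.log".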


