[Bioclusters] Re: new on using clusters: problem running mpiblast (2)
Zhiliang Hu
hu at animalgenome.org
Mon Oct 1 14:01:02 EDT 2007
Our cluster vendor's support engineer thinks the Torque launcher may not be
the problem. He ran mpiblast with the "--debug" option and got the following:
> /opt/openmpi.gcc/bin/mpirun -np 3 --mca btl openib,self -machinefile
./machines /home/local/bin/mpiblast --debug -p blastp -i ./bait.fasta -d
ecoli.aa
[0] 0.00206017 Locking fragment list
[0] 0.00232911 Locked fragment list
[0] 0.00454402 broadcasting file size of 390
[0] 0.00466418 file size broadcasted
[0] 0.00472903 broadcasting file
[0] 0.00487208 file broadcasted
[0] 0.00493002 initializing ncbi ...blastall -p blastp -i
./bait.fasta -d /raid/pub/ncbi/blast/mpidb/ecoli.aa
[0] 0.00504208
(0) done initializing ncbi.
[0] 0.00791001 Init blast error code 0
[0] 0.0187011 First date: Aug 29, 2007 1:24 AM
[1] 0.034095 Temp name base: /tmp/bait.fastaXXXXXX
[1] 0.034435 Got temp name: /tmp/bait.fasta7ciIil
[1] 0.0344911 waiting for file size broadcast
[1] 0.034523 received file size broadcast of 390
[1] 0.03456 opening receive file /tmp/bait.fasta7ciIil
[1] 0.034647 receiving file to /tmp/bait.fasta7ciIil
[1] 0.0346761 received file broadcast
[1] 0.0348959 Query file received as /tmp/bait.fasta
[1] 0.0349259 Receiving query length adjustments and effective
db sizes from the master
[1] 0.0349801 1 adjustments and effective db sizes received
successfully
[1] 0.035022 First date: Aug 29, 2007 1:24 AM
[2] 0.045331 Temp name base: /tmp/bait.fastaXXXXXX
[2] 0.0456631 Got temp name: /tmp/bait.fastasGSwMl
[2] 0.0457201 waiting for file size broadcast
[2] 0.0457571 received file size broadcast of 390
[2] 0.045788 opening receive file /tmp/bait.fastasGSwMl
[2] 0.0458901 receiving file to /tmp/bait.fastasGSwMl
[2] 0.0459352 received file broadcast
[2] 0.046097 Query file received as /tmp/bait.fasta
[2] 0.0461261 Receiving query length adjustments and effective
db sizes from the master
[2] 0.04617 1 adjustments and effective db sizes received successfully
[2] 0.0462182 First date: Aug 29, 2007 1:24 AM
[2] 0.0462492 Temp name base: /tmp/ecoli.aaXXXXXX
[2] 0.0463071 Got temp name: /tmp/ecoli.aaxum8jY
[2] 0.0554571 initializing ncbi ...blastall -p blastp -i
/tmp/bait.fastasGSwMl -m 11 -o /dev/null -d /tmp/ecoli.aaxum8jY
[2] 0.05564
(2) done initializing ncbi.
[2] 0.0556982 Locking fragment list
[2] 0.0557191 Locked fragment list
[2] 0.0566981 Fragment list sent.
[2] 0.0567541 Idle message sent
[1] 0.0749409 Node 2 has no fragments
[1] 0.0749829 Received fragment list update from node 2
[1] 0.075444 Scheduler got tag 10 from 2
1 0.0755 Bailing out with signal 11
[node001:10599] MPI_ABORT invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 0
0 0.0813701 Bailing out with signal 15
[node003:31592] MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 0
2 0.081321 Bailing out with signal 15
[node002:10480] MPI_ABORT invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 0
2 0.081321 Bailing out with signal 15
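
For anyone reading a trace like the one above, it can help to pick out which rank bailed out first and with which signal. On Linux, signal 11 is normally SIGSEGV (a segmentation fault) and signal 15 is SIGTERM, so the signal-15 lines are probably just the other ranks being shut down after the first abort. Below is a minimal sketch in Python that does this. It assumes the "Bailing out with signal N" line format seen in the trace, and the function names are made up for illustration; they are not part of mpiblast or Open MPI.

```python
import re

# Assumption: on Linux, 15 is SIGTERM (a rank being told to stop),
# so any other signal is treated as the interesting one.
SIGTERM = 15

# Matches "<rank> <seconds> Bailing out with signal <n>", as in the trace.
# \s+ also matches a newline, so a wrapped line is still picked up.
BAIL_RE = re.compile(r"\b(\d+)\s+(\d+(?:\.\d+)?)\s+Bailing out with signal (\d+)")


def find_bailouts(log_text):
    """Return (rank, timestamp, signal) tuples sorted by timestamp."""
    hits = [(int(r), float(t), int(s)) for r, t, s in BAIL_RE.findall(log_text)]
    return sorted(hits, key=lambda hit: hit[1])


def first_crash(log_text):
    """Return the earliest bail-out that is not SIGTERM, or None."""
    for rank, timestamp, sig in find_bailouts(log_text):
        if sig != SIGTERM:
            return rank, timestamp, sig
    return None
```

Calling first_crash() on the text of the trace should point at rank 1 with signal 11, which suggests the scheduler code on rank 1 is where to start looking.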