[Bioclusters] Re: new on using clusters: problem running mpiblast (2)
Zhiliang Hu
hu at animalgenome.org
Tue Sep 4 15:42:54 EDT 2007
Somehow I found replies to my post in the "Bioclusters" list archive through
a Google search, but I didn't get them in my mailbox. Anyway, let me follow
up on the messages captured on the web:
- I recompiled and made sure "mpiblast" is located on an NFS-shared
file path, and got the same errors.
- When I added "--debug" to the mpirun command line I got the same error,
with no extra messages.
- I did run a small C "hello" program from my colleagues, which worked fine
(got responses from every node).
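
Since the hello program reaches every node, one further check is which MPI
shared library the mpiblast binary actually resolves at run time. The
following is only a minimal sketch, not a definitive diagnosis. The helper
names mpi_libs and linked_under_prefix are made up for illustration. The
defaults /usr/local/bin/mpiblast and /opt/openmpi.gcc are taken from the
command line quoted further down; the new NFS location is not stated, so set
MPIBLAST to wherever the binary now lives. If MPI was linked statically, ldd
will list no MPI library at all and this check says nothing either way.

```shell
#!/bin/sh
# Sketch: report which MPI shared libraries a binary is linked against.

# mpi_libs: read `ldd` output on stdin, print only lines naming an MPI library
# (matches libmpi.so as well as e.g. libmpich.so).
mpi_libs() {
    grep -i 'libmpi' || true
}

# linked_under_prefix PREFIX: succeed if an MPI library on stdin
# resolves to a path under PREFIX.
linked_under_prefix() {
    mpi_libs | grep -q -F "=> $1/"
}

# Assumed defaults, taken from the quoted mpirun command; override as needed.
MPIBLAST=${MPIBLAST:-/usr/local/bin/mpiblast}
OMPI_PREFIX=${OMPI_PREFIX:-/opt/openmpi.gcc}

if [ -x "$MPIBLAST" ]; then
    if ldd "$MPIBLAST" | linked_under_prefix "$OMPI_PREFIX"; then
        echo "mpiblast resolves its MPI library under $OMPI_PREFIX"
    else
        echo "no MPI library found under $OMPI_PREFIX; ldd reports:"
        ldd "$MPIBLAST" | mpi_libs
    fi
fi
```

Running this on the head node and on one compute node and comparing the
output would show whether both see the same library. A mismatch between the
library found here and the mpirun being used would be consistent with the
build or environment problems suspected in this thread, though it would not
prove them.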
One of my colleagues suspects that mpiblast may not have been compiled
correctly with OpenMPI, and I am looking at
https://wiki.rocksclusters.org/wiki/index.php/MPI-Blast_with_OpenMPI but
that one is for Rocks OS, while I have CentOS 5. My vendor is suggesting my
environment setup might have some problems... I have been checking
"everything" and am still in the mist ;-)
Please let me know if you have any more suggestions. Thanks in advance!
Zhiliang
> Hi Zhiliang,
>
> the command-line looks reasonable, does mpiblast generate any additional
> output when the --debug option is added to the command-line?
> Is the /usr/local/bin filesystem replicated on each node? i.e. does
> every node have a copy of mpiblast located at /usr/local/bin/mpiblast?
> I personally have not tested mpiBLAST with OpenMPI, although mpiBLAST
> doesn't do anything too fancy with MPI so it really ought to work.
>
> -Aaron
>
>
> Zhiliang Hu wrote:
>> I am new on using clusters.
>>
>> I have just installed mpiblast 1.4.0 with ncbi toolbox (June 2005)
>> from source codes on a linux cluster [x86_64/x86_64 (GNU/Linux), CentOS].
>> The installation seemed to be successful.
>>
>> Now when I try the following:
>>
>> > /opt/openmpi.gcc/bin/mpirun -np 14
>> /usr/local/bin/mpiblast -p blastn
>> -i /raid/pub/ncbi/blast/db/BTrsSNP
>> -d bta.genome.chr
>> -o out1
>> -e 0.0000000001
>> -W 38 -v 1 -b 1
>>
>> and immediately got following errors:
>> ----------------
>> MPI_ABORT invoked on rank 0 in communicator MPI_COMM_WORLD with
>> errorcode 0
>> mpirun noticed that job rank 1 with PID 28131 on node xxxx.xxxxxx.xxx
>> exited on signal 15 (Terminated).
>> 12 additional processes aborted (not shown)
>> --------------------------------
>>
>> Maybe I am missing something obvious? Could anyone point to the right
>> place for tracing the problem? ...
>>
>> Zhiliang