Thanks Aaron. The program is called lamwipe, but it doesn't help at all. I still run into these errors very frequently. On the rare occasions when mpiblast works, it takes an awfully long time! By the way, is mpich an alternative to lam-mpi?

-Cel.

Aaron Darling <darling at cs.wisc.edu> wrote:

Hi,
based on the call stack it looks like LAM is crashing during initialization (MPI_Init()). I can't imagine why mpiblast would cause such behavior. Do other parallel applications exhibit similar behavior? Perhaps you need to clean your LAM environment after a previous crash? I think there's a program called lamclean or something. Maybe somebody else could comment--I'm mostly an mpich user.

-Aaron

Celeste vikram wrote:
> Hi,
>
> Sometimes I get some errors while running mpiblast1.3...Any help would
> be appreciated.
> -----------------------------------------------------------------------------
> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
> Rank (1, MPI_COMM_WORLD): - MPI_Recv()
> Rank (1, MPI_COMM_WORLD): - MPI_Allreduce()
> Rank (1, MPI_COMM_WORLD): - MPI_Comm_split()
> Rank (1, MPI_COMM_WORLD): - MPI_Init()
> Rank (1, MPI_COMM_WORLD): - main()
> MPI_Recv: process in local group is dead (rank 2, MPI_COMM_WORLD)
> Rank (2, MPI_COMM_WORLD): Call stack within LAM:
> Rank (2, MPI_COMM_WORLD): - MPI_Recv()
> Rank (2, MPI_COMM_WORLD): - MPI_Allreduce()
> Rank (2, MPI_COMM_WORLD): - MPI_Comm_split()
> Rank (2, MPI_COMM_WORLD): - MPI_Init()
> Rank (2, MPI_COMM_WORLD): - main()
> MPI_Recv: process in local group is dead (rank 3, MPI_COMM_WORLD)
> Rank (3, MPI_COMM_WORLD): Call stack within LAM:
> Rank (3, MPI_COMM_WORLD): - MPI_Recv()
> Rank (3, MPI_COMM_WORLD): - MPI_Allreduce()
> Rank (3, MPI_COMM_WORLD): - MPI_Comm_split()
> Rank (3, MPI_COMM_WORLD): - MPI_Init()
> Rank (3, MPI_COMM_WORLD): - main()
> -----------------------------------------------------------------------------
> One of the processes started by mpirun has exited with a nonzero exit
> code. This typically indicates that the process finished in error.
> If your process did not finish in error, be sure to include a "return
> 0" or "exit(0)" in your C code before exiting the application.
>
> PID 17770 failed on node n0 (40.191.22.1) due to signal 15.
> -----------------------------------------------------------------------------
>
> Thx,
> Cel
>
>_______________________________________________
>Bioclusters maillist - Bioclusters at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters
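
For anyone triaging output like the log quoted in this thread, here is a rough sketch that tallies which ranks reported a dead peer, the call stack each rank printed, and the PID that mpirun says failed. The helper name `summarize_lam_log` and the regular expressions are assumptions inferred only from the lines quoted in this message; they do not come from LAM's documentation and may not match other LAM versions.

```python
import re
from collections import OrderedDict

# Patterns inferred from the log lines quoted in this thread (an assumption,
# not a documented LAM output format).
DEAD_RE = re.compile(
    r"(MPI_\w+): process in local group is dead \(rank (\d+), (\w+)\)"
)
FRAME_RE = re.compile(r"Rank \((\d+), \w+\):\s*-\s*(\w+)\(\)")
SIGNAL_RE = re.compile(
    r"PID (\d+) failed on node (\w+) \(([\d.]+)\) due to signal (\d+)"
)


def summarize_lam_log(text):
    """Hypothetical helper: summarise dead-peer reports and call stacks."""
    # Per-rank call stack, innermost frame first, in the order printed.
    stacks = OrderedDict()
    for rank, func in FRAME_RE.findall(text):
        stacks.setdefault(int(rank), []).append(func)

    # (rank, MPI call that noticed the dead peer)
    dead = [(int(rank), call) for call, rank, _comm in DEAD_RE.findall(text)]

    # The process mpirun reports as failed, if that line is present.
    failed = None
    match = SIGNAL_RE.search(text)
    if match:
        failed = {
            "pid": int(match.group(1)),
            "node": match.group(2),
            "addr": match.group(3),
            "signal": int(match.group(4)),
        }
    return {"dead_reports": dead, "stacks": stacks, "failed": failed}
```

Applied to the log in this message, the summary would show ranks 1, 2 and 3 each blocked in MPI_Recv() underneath MPI_Init(), which agrees with Aaron's reading that the failure happens during initialization. On Linux, signal 15 is SIGTERM.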