[Bioclusters] error on qsub/mpirun jobs

Michael Edwards miedward at gmail.com
Mon Sep 8 11:35:49 EDT 2008


On Fri, Sep 5, 2008 at 4:55 PM, Zhiliang Hu <zhu at iastate.edu> wrote:

>
> ----------------------------------------------------------
> Unable to copy file /var/spool/torque/spool/658.nagrp2..ER to
> hu at hist:/raid/pub/ncbi/blast/www/mpiblast.tmp
> >>> error from copy
> Host key verification failed.
> lost connection
> >>> end error output
> Output retained on that host in:
> /var/spool/torque/undelivered/658.nagrp2..ER
> ----------------------------------------------------------
>
> Note: When I check manually, the "retained" file is not there:
> "/var/spool/torque/undelivered/658.nagrp2..ER"
>

qsub submits the script to the batch system, which runs it in a shell on the
selected compute node.  The retained file is therefore not on the head node
but on the local file system of the compute node that ran the job.
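
To look for the retained file, the check has to happen on the compute node
rather than on the head node.  A minimal sketch of that check, assuming ssh
from the head node to the compute node works; the node name "node01" and the
helper name are hypothetical, and only the path comes from the error message
above:

```python
import shlex


def undelivered_check_command(node, path):
    """Build an ssh command that lists `path` on the compute node `node`."""
    # The ls runs on the remote node, so it inspects that node's local disk,
    # not the head node's.
    return ["ssh", node, "ls -l " + shlex.quote(path)]


if __name__ == "__main__":
    # "node01" is a hypothetical host name; substitute the node that ran the job.
    cmd = undelivered_check_command(
        "node01", "/var/spool/torque/undelivered/658.nagrp2..ER"
    )
    print(" ".join(cmd))
```

Passing the returned list to subprocess.run() would run the check; the sketch
only builds the command so it can be inspected first.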

For a job submitted with qsub to run, the code has to be present at the same
path on every compute node.  The easiest way to arrange this is a shared file
system.
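
One way to confirm that a path really is visible on every node is to test for
it over ssh.  This is only a sketch under assumptions: the node names are
hypothetical, non-interactive ssh to each node is assumed to work, and the
path is assumed to contain no spaces.

```python
import subprocess


def missing_on_nodes(nodes, path, run=subprocess.run):
    """Return the nodes on which `path` is absent or ssh itself failed."""
    missing = []
    for node in nodes:
        # `test -e` exits 0 only if the path exists on the remote node.
        # BatchMode=yes makes ssh fail instead of prompting for input.
        result = run(["ssh", "-o", "BatchMode=yes", node, "test", "-e", path])
        if result.returncode != 0:
            missing.append(node)
    return missing


if __name__ == "__main__":
    # Hypothetical node names and path; replace with the cluster's own.
    print(missing_on_nodes(["node01", "node02"], "/path/to/job/code"))
```

An empty list suggests the path is visible everywhere, as it would be on a
shared file system.  A non-zero exit can also mean the ssh connection failed,
so a node in the list is worth checking by hand.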

