[Bioclusters] OpenPBS problems

Ron Chen bioclusters@bioinformatics.org
Tue, 2 Dec 2003 05:07:19 -0800 (PST)

--- Ami Klein <ami@genome.kvl.dk> wrote:
> when submitting jobs with no delay (or less than 3 
> seconds delay) between each submission the system 
> becomes unstable: sometimes pbs_sched crashes without 
> any note in its log file.
> Should we switch to the PBSPro version?

This is a known problem.

The easiest workaround is to put a sleep in qsub, so
that users are forced to wait for 3 or more seconds
before another job submission.
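
As a rough illustration of that workaround, here is a minimal Python sketch
of a wrapper that users would call instead of qsub. The stamp file path, the
function names, and the location of the real qsub binary are all my own
assumptions, not anything shipped with OpenPBS. It only throttles
submissions per user and makes no attempt to handle two wrappers racing
each other.

```python
import os
import subprocess
import time

MIN_INTERVAL = 3.0  # seconds between submissions, as suggested above
STAMP_FILE = os.path.expanduser("~/.qsub_last_submit")  # hypothetical path
REAL_QSUB = "qsub"  # assumed to be on PATH; point this at the real binary


def wait_for_slot(stamp_file=STAMP_FILE, min_interval=MIN_INTERVAL,
                  now=time.time, sleep=time.sleep):
    """Sleep until min_interval seconds have passed since the last submission.

    Returns the number of seconds that had to be waited (0.0 if none).
    """
    try:
        last = os.path.getmtime(stamp_file)
    except OSError:
        last = 0.0  # no previous submission recorded
    # Clamp to min_interval in case the file timestamp is slightly ahead
    # of the local clock (for example on a network home directory).
    remaining = min(min_interval, min_interval - (now() - last))
    if remaining > 0:
        sleep(remaining)
    # Record this submission by creating/touching the stamp file.
    with open(stamp_file, "a"):
        pass
    os.utime(stamp_file, None)
    return max(remaining, 0.0)


def submit(qsub_args, run=subprocess.call):
    """Wait for a free slot, then hand the arguments to the real qsub."""
    wait_for_slot()
    return run([REAL_QSUB] + list(qsub_args))
```

A user-facing script would then just call submit(sys.argv[1:]) and exit
with its return value.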

Before you switch to PBSPro, which costs money, you
should try Scalable PBS (SPBS), which has *lots* of
problems fixed.


Another batch system that I like is Gridengine (SGE).

It has more advanced scheduling algorithms and better
features. One example is the "job array": a group of
jobs that share the same job id but have different
task ids, which makes genome-type workloads easier to
manage.
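
To make the job array idea concrete, here is a small Python sketch of what
each task in an array might do. Gridengine passes the task id to every task
in the SGE_TASK_ID environment variable when the job is submitted with
"qsub -t 1-N"; the file names and the pick_input helper below are made up
for illustration.

```python
import os


def pick_input(inputs, env=os.environ):
    """Return the input that belongs to this array task.

    Task ids are treated as starting at 1 here (i.e. "qsub -t 1-N"),
    so task 1 gets inputs[0].
    """
    raw = env.get("SGE_TASK_ID", "1")
    # Outside of an array job the variable may be missing or non-numeric;
    # fall back to the first input in that case.
    task_id = int(raw) if raw.isdigit() else 1
    return inputs[task_id - 1]


# Hypothetical list of sequence files, one per task.
INPUTS = ["chr1.fa", "chr2.fa", "chr3.fa"]
```

Submitted with "qsub -t 1-3", the three tasks would share one job id and
each would call pick_input(INPUTS) to find its own file.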

**HOWEVER**, switching to a new batch system may
require learning a new set of concepts.

Both SPBS and Gridengine are free and open source;
download and try them.

> stdout and stderr files go to the $HOME directory
> (and not to ~/.pbs_spool).

Don't know about this problem :(

> We also need a "swap out queue", i.e. a low priority
> queue that will suspend running jobs (and swap them 
> out to the disk) in case some jobs in other queues 
> need its CPU. Does such a feature exist under the PBS
> system?

The closest thing you can use is checkpointing. I
don't think a batch system can tell the OS to "swap
processes to disk".

Default PBS may not allow you to do that; maybe you
need to use Maui+PBS. In SGE, you can use "subordinate


> Thanks.
> -- 
> Ami Klein,
> System Administrator
> Division of Genetics
> Institute of Animal Science and Animal Health
> The Royal Veterinary and Agricultural University
> Email: ami@genome.kvl.dk
