[Bioclusters] SGE and checkpointing
Ivo Grosse
bioclusters@bioinformatics.org
Wed, 15 May 2002 13:49:03 -0400
Hi,
- does SGE support checkpointing? How?
- if yes, can SGE temporarily suspend low-priority jobs that are
already running when high-priority jobs are waiting in the queue?
Ivo
P.S.
I mean: in the standard implementation of SGE that we currently use,
SGE is able to rearrange the jobs in the queue by priority. That is,
if you submit a low-priority job first, and I submit a high-priority
job second, and your job has not yet started, then my job will be
executed first. However, if your job has already started, then (in the
implementation of SGE that we currently use) my job will have to wait
until your job is done. Can this be changed?
We would like to have the following behavior:
User X submits 80 low-priority jobs, and each of them will run for 2
weeks. Since the queue is currently empty, all of the jobs get
started. Now user Y wants to submit just one high-priority job, which
will run for only 1 day. Unfortunately, in the current implementation
of SGE, the job of user Y would have to wait for 2 weeks. What we
would like is that one of the low-priority jobs of X gets temporarily
suspended, the high-priority job of user Y gets started, and after the
job of Y is finished, the suspended job of user X is continued. Is
that possible? How?
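
To make the scenario concrete, here is a toy sketch in Python. It does
not call SGE or model its scheduler; the function name and constants are
made up for illustration. It assumes that every slot is occupied by one
of X's 2-week jobs started on day 0, and that Y's 1-day job arrives
while they are still running.

```python
LOW_JOB_DAYS = 14   # each of user X's jobs runs for 2 weeks
HIGH_JOB_DAYS = 1   # user Y's job runs for 1 day


def high_priority_outcome(arrival_day, preempt):
    """Return (start day of Y's job, finish day of the X job sharing its slot).

    Hypothetical model: all slots are busy with X's jobs since day 0.
    """
    if preempt:
        # One of X's jobs is suspended, Y's job runs right away,
        # and the suspended job resumes once Y's job is finished.
        start = arrival_day
        low_finish = LOW_JOB_DAYS + HIGH_JOB_DAYS
    else:
        # Y's job waits until one of X's jobs has run to completion.
        start = LOW_JOB_DAYS
        low_finish = LOW_JOB_DAYS
    return start, low_finish


if __name__ == "__main__":
    print("without suspension:", high_priority_outcome(0, preempt=False))
    print("with suspension:   ", high_priority_outcome(0, preempt=True))
```

Under these assumptions, suspension costs one of X's jobs a single
extra day, while without it Y's job is delayed by the full 2 weeks.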