[Bioclusters] SGE and preemption
Ivo Grosse
bioclusters@bioinformatics.org
Wed, 15 May 2002 15:16:40 -0400
Hi again,
This is a follow-up to my last SGE question (Subject: SGE and
checkpointing). After a naive web search, I think the feature
we want is called "preemption." Is that correct?
http://www.supercluster.org/documentation/maui/8.4preemption.html
Maui is a scheduler that supports preemption, and I read that
SGE can be combined with Maui. Is that correct?
http://www.supercluster.org/documentation/maui/sgeintegration.html
Of course, if SGE (without Maui) could do preemption, that would be
great. Can it do that? How?
Best regards, Ivo
+++
From: Ivo Grosse <grosse@cshl.org>
Organization: Cold Spring Harbor Laboratory
To: bioclusters@bioinformatics.org
Subject: [Bioclusters] SGE and checkpointing
Hi,
- Does SGE support checkpointing? How?
- If yes, is SGE capable of temporarily suspending low-priority jobs
when there are high-priority jobs waiting in the queue?
Ivo
P.S.
I mean: in the standard implementation of SGE that we currently use,
SGE is able to rearrange the jobs in the queue by priority. That is,
if you submit a low-priority job first, and I submit a high-priority job
second, and your job has not yet started, then my job will be executed
first. However, if your job has already started, then (in the
implementation of SGE that we currently use) my job will have to wait
until your job is done. Can this be changed?
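To make the behaviour described above concrete, here is a minimal toy
model in Python. It is only a sketch of priority ordering without
preemption, not SGE code: the class NonPreemptiveQueue, its methods, and
the job names are all hypothetical, and "priority" is assumed to mean
that a larger number runs first.

```python
import heapq
from itertools import count


class NonPreemptiveQueue:
    """Toy model: pending jobs are ordered by priority,
    but a job that has started is never interrupted."""

    def __init__(self, slots):
        self.free_slots = slots   # number of jobs that may run at once
        self.pending = []         # heap of (-priority, submit order, name)
        self.running = []         # names of jobs currently holding a slot
        self._order = count()     # tie-breaker: earlier submission first

    def submit(self, name, priority):
        # heapq is a min-heap, so negate the priority to pop the highest first
        heapq.heappush(self.pending, (-priority, next(self._order), name))

    def dispatch(self):
        """Start pending jobs while slots are free; return the names started."""
        started = []
        while self.free_slots > 0 and self.pending:
            _, _, name = heapq.heappop(self.pending)
            self.running.append(name)
            self.free_slots -= 1
            started.append(name)
        return started
```

If both jobs are still pending when dispatch() is called, the
high-priority one starts first. If the low-priority job has already been
dispatched into the only slot, a later high-priority submission just
stays in the pending heap, which is the situation we would like to avoid.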
We would like to have the following solution:
User X submits 80 low-priority jobs, and each of them will run for 2
weeks. Since the queue is currently empty, all of the jobs get
started. Now user Y wants to submit just one high-priority job, which
will run for only 1 day. Unfortunately, in the current implementation
of SGE, the job of user Y would have to wait for 2 weeks. What we
would like is that one of the low-priority jobs of X gets temporarily
suspended, the high-priority job of user Y gets started, and after the
job of Y is finished, the suspended job of user X is continued. Is
that possible? How?
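The scenario above can be sketched as a small day-by-day simulation in
Python. This is only an illustration of the policy we are asking for,
not a description of how SGE or Maui implement it. The names Job and
simulate are hypothetical, and the sketch assumes 80 slots, whole-day
time steps, a suspended job that keeps its progress, and that Y's job is
submitted on day 1, after all of X's jobs have started on day 0.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int     # larger number means more important
    duration: int     # days of run time the job needs
    submit_day: int   # day on which the job enters the queue


def simulate(jobs, slots, preempt):
    """Return {job name: day on which the job finishes}."""
    remaining = {job.name: job.duration for job in jobs}
    finish_day = {}
    running = []      # jobs currently holding a slot
    day = 0
    while len(finish_day) < len(jobs):
        # Jobs that are submitted, unfinished, and not holding a slot
        # (this includes jobs that were suspended earlier).
        waiting = [j for j in jobs
                   if j.submit_day <= day
                   and j.name not in finish_day
                   and j not in running]
        waiting.sort(key=lambda j: -j.priority)  # stable: ties keep list order
        for job in waiting:
            if len(running) < slots:
                running.append(job)
            elif preempt:
                victim = min(running, key=lambda j: j.priority)
                if victim.priority < job.priority:
                    running.remove(victim)  # suspended, progress is kept
                    running.append(job)
        # Every job holding a slot advances by one day.
        for job in list(running):
            remaining[job.name] -= 1
            if remaining[job.name] == 0:
                finish_day[job.name] = day + 1
                running.remove(job)
        day += 1
    return finish_day


# 80 low-priority two-week jobs from user X, one high-priority one-day job from Y.
jobs = [Job(f"X{i:02d}", priority=0, duration=14, submit_day=0) for i in range(80)]
jobs.append(Job("Y", priority=10, duration=1, submit_day=1))

without_preemption = simulate(jobs, slots=80, preempt=False)
with_preemption = simulate(jobs, slots=80, preempt=True)
print(without_preemption["Y"], with_preemption["Y"])
```

In this toy model, Y's job finishes on day 15 without preemption and on
day 2 with it, while the one suspended job of X finishes a single day
later than it otherwise would.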