[Bioclusters] limiting job memory size on G5s

Chris Dagdigian dag at sonsorol.org
Wed Dec 15 16:47:48 EST 2004


Hi Barry,

This is probably a Grid Engine issue; enforcing memory usage limits for 
the purpose of sending suspend/terminate signals etc. has long been part 
of the SGE feature set.

You may want to post this message to the Grid Engine users mailing list, 
users at gridengine.sunsource.net, to see if anyone else is doing memory 
limit enforcement on Mac OS X with SGE 6.

I did a brief search through the open bug reports but did not see any 
open issues that match your problem.

That said, I've only seen jobs trip SGE queues into error state 
'E' when the job itself failed in a spectacular manner. It's odd that it 
errors out several queues and then runs on a different box.

If you are willing to share your test code I'd be interested in trying 
to replicate on one of the G5 clusters I have access to.

Regards,
Chris



Barry J Mcinnes wrote:

> We have a test cluster of G5s running 10.3.6 and GE 6.
> We want to be able to limit physical memory usage, which works on SGE 
> under Solaris.
> The current limits for the queue are set via qmon, and are printed out 
> for each job as
> 
> cputime         unlimited
> filesize        unlimited
> datasize        2097152 kbytes
> stacksize       65536 kbytes
> coredumpsize    0 kbytes
> memoryuse       1048576 kbytes
> descriptors     1024
> memorylocked    unlimited
> maxproc         100
> 
> When we run our memory-allocating test job, which grabs n times 4 bytes, 
> the following happens: n = 2, 2.5 and 2.75 x 10**8 all run fine. For 
> 3.00x10**8, GE tries running the job on a node, puts the node in E state,
> tries another node, puts it in E state, then actually runs on a third node.
> All G5s are identical hardware. By my calculations the 2.75x4x10**8 
> should fail.
> 
> Anyway, is there a way to physically limit a job to 1GB of memory and 
> have the job fail with an error, without putting the node in Error 
> state and locking the node for other jobs?
> 
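
As a quick sanity check on the numbers quoted above, here is a small sketch 
of the arithmetic. It assumes "kbytes" means 1024 bytes, that the test job 
allocates exactly n x 4 bytes and nothing else, and that the largest case is 
3.00 x 10**8 elements. The function names are made up for illustration. It 
only compares the request sizes against the printed limits and says nothing 
about whether the OS actually enforces them.

```python
# Limits as printed for each job in the quoted message, in kbytes.
# Assumption: 1 kbyte = 1024 bytes.
KBYTE = 1024
LIMITS_KB = {"datasize": 2097152, "memoryuse": 1048576}

BYTES_PER_ELEMENT = 4  # the test job grabs n times 4 bytes


def request_kb(n_elements):
    """Size of an n-element allocation in kbytes."""
    return n_elements * BYTES_PER_ELEMENT / KBYTE


def exceeded_limits(n_elements):
    """Names of the printed limits that the allocation would exceed."""
    need = request_kb(n_elements)
    return [name for name, limit in LIMITS_KB.items() if need > limit]


if __name__ == "__main__":
    # The four problem sizes mentioned in the message (last one assumed).
    for n in (200_000_000, 250_000_000, 275_000_000, 300_000_000):
        print(n, round(request_kb(n)), exceeded_limits(n))
```

Under those assumptions the 2.75 x 10**8 case needs about 1,074,219 kbytes, 
which is over the 1,048,576 kbyte memoryuse limit but well under the 
datasize limit, so the expectation that it "should fail" is consistent with 
the memoryuse setting alone.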

-- 
Chris Dagdigian, <dag at sonsorol.org>
BioTeam  - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag  Web: http://bioteam.net

