<html>

<body>

We're running two small IBM BladeCenter clusters under SuSE, with GPFS

for (we hope) fast file I/O. It seems to us that when user processes on a

blade are particularly memory intensive, and GPFS needs to compete for a

resource (memory in this case), GPFS most likely won't survive the

competition and will die. This may happen on one or more nodes of the

cluster. The GPFS daemon 'mmfsd' will lose its connection to other

members of the cluster and lose its GPFS filesystem mounts, and

consequently any services that reside on GPFS will fail. The blade will

not necessarily crash after that; it may stay afloat and may even be

accessible via ssh.<br><br>

Have others encountered this situation? How can we prevent this behavior?

More generally, what kinds of limits do you impose on consumption of

resources such as memory and CPU? Thanks,<br><br>

Hershel<br><br>

<br>


<hr>

<font size=3>Hershel M. Safer, Ph.D.<br>

Chair, 5th European Conference on Computational Biology (ECCB '06)<br>

Head, Bioinformatics Core Facility<br>

Weizmann Institute of Science<br>

PO Box 26, Rehovot 76100, Israel<br>

tel: +972-8-934-3456 | fax: +972-8-934-6006<br>

e-mail:

<a href="mailto:hershel.safer@weizmann.ac.il">

hershel.safer@weizmann.ac.il</a> |

<a href="mailto:hsafer@alum.mit.edu">hsafer@alum.mit.edu</a><br>

url:

<a href="http://bioportal.weizmann.ac.il/" eudora="autourl">

http://bioportal.weizmann.ac.il</a><br><br>

***************************************************<br>

Plan now for ECCB '06!<br>

5th European Conference on Computational Biology<br>

Eilat, Israel, Sept 10 -- 13, 2006<br>

Visit

<a href="http://www.eccb06.org/" eudora="autourl">www.eccb06.org</a> for

details </font></body>

</html>