Which of these (PVFS2, Lustre, GPFS) have some level of redundancy?
<pre class="moz-signature" cols="72">===============================================
David Coornaert (<a class="moz-txt-link-abbreviated" href="mailto:email@example.com">firstname.lastname@example.org</a>)
Belgian Embnet Node (<a class="moz-txt-link-freetext" href="http://www.be.embnet.org">http://www.be.embnet.org</a>)
Université Libre de Bruxelles
Laboratoire de Bioinformatique
12, Rue des Professeurs Jeener & Brachet
<pre wrap="">Now that we have all 63 up and running it looks like we are
getting performance issues with NFS much in the same way
that others have reported here. Even moderate job loads
produce trouble - (nfsstats -c show lots of retransmissions),

Are you using NFS over TCP? If not, you probably should. It does
introduce one reliability problem, in that NFS over TCP is no
longer stateless: if the file server goes down, clients may hang.
But since your file server is your head node, it's mostly a moot
point. Lose the head node, and you lose the cluster anyway.
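
As a sketch, mounting with the "tcp" option looks something like
this (the server name, export path, and mount point are
placeholders; check nfs(5) on your distribution for the exact
option names):

    # /etc/fstab entry forcing NFS over TCP
    headnode:/export/home  /home  nfs  tcp,hard,intr  0 0

    # equivalent one-off mount from the command line
    mount -o tcp,hard,intr headnode:/export/home /home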
<pre wrap="">grid engine execds don't report back in so qhost shows nodes not
responding though eventually they will return. On occasion one of
the switches stops and that whole "side" of the cluster disappears.
so we reboot the switch and are back in action. Anyway here are my
questions (thanks for your patience in reading through this)
Has anyone had similar problems with these SMC switches ?
I'm not accustomed to having the switches die like this.
In terms of improving NFS performance I've already
put SGE spool onto the local nodes to try to improve things
but only helps a little. There are various NFS tuning
documents with respect to clusters ( using tcp, atime, rsize,
wsize, etc options to mount). I've experimented with a few of
these (rsize, wsize) though with only very marginal positive impact.
for those with larger clusters and similar issues have you found
a subset of these options to be more key or influential than others ?

If you use NFS over TCP, the "rsize" and "wsize" parameters are
irrelevant. The Linux NFS how-to suggests raising the sysctl
values of "net.core.rmem_max" and "net.core.rmem_default" above
their usual values of 64k. You should also pay attention to the
number of nfsd processes running on your server. The rule of
thumb is eight per CPU. In principle, the more clients you have,
the more nfsd processes you want. But multiple server processes
contend for resources themselves, so you reach a point of
diminishing returns in starting more.
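
For what it's worth, a sketch of that server-side tuning (the
numbers are illustrative, not recommendations, and the nfsd count
lives in different files depending on your distribution):

    # raise the default and maximum receive buffer sizes
    sysctl -w net.core.rmem_default=262144
    sysctl -w net.core.rmem_max=262144

    # run more nfsd threads, e.g. 16 for a two-CPU server; on many
    # distributions, set RPCNFSDCOUNT=16 in /etc/sysconfig/nfs (or
    # /etc/default/nfs-kernel-server) and restart the NFS service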
<pre wrap="">One scenario that has been discussed is bonding two NICs
on the v40z in conjunction with switch trunking. Does anyone
have any opinions or ideas on this ?

If your switch can trunk, go ahead. I trunk together gigabit
Ethernet interfaces on a FreeBSD file server. I've heard rumours
to the effect that a four-way trunk on Linux can be slower than a
two-way one, due to problems in the bonding driver. Regard that
as hearsay, however, because I don't have any experience with
such things on Linux. You might also consider using jumbo frames,
if your switches support them.
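
A rough sketch of two-NIC bonding on Linux (the interface names
and address are placeholders, the bonding mode has to match how
the switch trunk is configured, and jumbo frames only help if
every NIC and switch on the path supports them):

    # load the bonding driver, e.g. in 802.3ad (LACP) mode
    modprobe bonding mode=802.3ad miimon=100

    # bring up the bond and enslave the two physical NICs
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1

    # optionally, enable jumbo frames
    ifconfig bond0 mtu 9000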
<pre wrap="">Lastly is it even worth
it to keep messing with NFS ? And maybe go for GFS.

There are a number of parallel or cluster file systems in
addition to GFS, like PVFS2 (free), Lustre (sort of free), GPFS
(free to universities), TeraFS (commercial), and Ibrix
(commercial). They may not work well for hosting home
directories, because they're not optimized for that sort of I/O
load. They're also, in my experience, rather less than stable. We
built a fifty-node cluster with just GPFS, no NFS, and very
little local disk. The results were quite

File I/O is one of the major unsolved problems of cluster
computing. Anybody who tells you otherwise is trying to sell you
something.