> Respectfully, and at the risk of sounding ridiculously naive --
> why not consider upgrading the I/O switching technology to Myrinet
> or Infiniband for higher bandwidth and ultra-low latency, before
> buying more servers?

A good question, but it addresses the wrong layer in this particular
conversation.  The network filesystem rests on top of the network
fabric.  The problems in this case are things like the number of
concurrent accesses allowed through a single server, and thrashing at
the disk level.  Even given a perfect network (zero latency, infinite
bandwidth) we would still need to have a conversation about the
scalability of the file server.  In fact, we would need to have that
conversation a lot sooner.

Other respondents have hit it on the head: there are two basic
approaches.  Modify the algorithm, or beef up the server.

Algorithm mods include copying or staging files to local space first,
operating on disks local to the nodes as much as possible, and
batching I/O rather than the classic "open FILEHANDLE; do_everything;
close FILEHANDLE" approach.

For beefing up the fileserver, we've had good results with Apple's
Xsan, price/performance-wise.  It's common knowledge that an
enterprise-scale file server can easily cost more than the cluster
it's supposed to serve.

-Chris Dwan
 The BioTeam

> Quoting Juan Carlos Perin <bic at genome.chop.edu>:
>
>> I had a question to see if anyone had any knowledge of a problem
>> we've been encountering.  It seems our Apple cluster is crashing
>> due to NFS.  When we run large batch jobs that frequently access
>> an NFS mount, the system ends up accumulating 'stuck' processes.
>> If the job is able to finish, it eventually cleans up the 'stuck'
>> processes, and all is well.
>> But if the job continues to accumulate these stuck processes and
>> runs long enough, the system slowly deteriorates and becomes less
>> and less responsive, eventually freezing up and not allowing
>> anything to function at all.
>>
>> We started the maximum number of NFS server daemons (20) and this
>> improved things, but didn't fix them.  We also limited the jobs to
>> 10 nodes (20 processors) to, in theory, allow each node to access
>> one NFS pipeline at any given time.  I'm not sure if anyone has
>> run into this before, or if anyone has ideas on how to approach
>> fixing this problem.  The only errors we're seeing otherwise are
>> in the system log, complaining about PasswordService not matching
>> the client's response.
>>
>> We're still running OS X 10.3.8 and our jobs are running through
>> SGE 5.3.  We've got a 16-node (32-processor) G5 system with at
>> least 2 GB of RAM per node.  The programs running are a mixture of
>> text-mining algorithms in both Perl and Java, both requiring
>> frequent reads on large .txt files residing in NFS-shared
>> directories.
>>
>> Thanks in advance for any ideas or suggestions.
>>
>> Juan Perin
>> Children's Hospital of Philadelphia
>> _______________________________________________
>> Bioclusters maillist - Bioclusters at bioinformatics.org
>> https://bioinformatics.org/mailman/listinfo/bioclusters
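The algorithm-side fix mentioned above (stage the big input to local
disk once, do all the repeated reads there, write results back in one
pass) can be sketched as an SGE-style job script.  This is only an
illustrative fragment: the file names and the grep step are
hypothetical stand-ins for the Perl/Java text-mining work, and on a
real node the input would live on the NFS mount while $TMPDIR (which
SGE sets to per-job local scratch) would sit on the node's own disk.

```shell
#!/bin/sh
# Sketch of the "stage to local space first" pattern for an SGE job.
# All paths are hypothetical stand-ins; on a real cluster NFS_DATA
# would be on the NFS mount and SCRATCH on node-local disk.

# Stand-in for a large input file on the shared NFS directory:
NFS_DATA=/tmp/fake_nfs_corpus.txt
printf 'gene alpha\nnothing here\ngene beta\n' > "$NFS_DATA"

SCRATCH="${TMPDIR:-/tmp}/stage.$$"    # node-local scratch space
mkdir -p "$SCRATCH"

# One bulk copy across NFS instead of thousands of small reads:
cp "$NFS_DATA" "$SCRATCH/corpus.txt"

# All of the heavy, repeated I/O now hits the local disk only
# (this grep stands in for the real text-mining step):
grep -c 'gene' "$SCRATCH/corpus.txt" > "$SCRATCH/count.txt"

# One bulk write back to the shared filesystem at the end:
cp "$SCRATCH/count.txt" /tmp/fake_nfs_count.txt

rm -rf "$SCRATCH"
cat /tmp/fake_nfs_count.txt
```

The point of the pattern is that the NFS server sees exactly two bulk
transfers per job, rather than a long stream of small reads from every
node competing for the same daemons and disk heads.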