Hi Glen

Thanks - I agree - Rocks looks great - but I agreed with the users not to consider a non-Debian-based solution unless it simply will not work any other way... that'll teach me to make pledges with users.

John

_____

From: bioclusters-bounces at bioinformatics.org [mailto:bioclusters-bounces at bioinformatics.org] On Behalf Of Glen Otero
Sent: Thursday, January 20, 2005 6:56 PM
To: Clustering, compute farming & distributed computing in life science informatics
Subject: Re: [Bioclusters] Newbie question: simple low-admin non-threaded Debian-based cluster solution?

Check out Rocks (http://www.rocksclusters.org). IMHO it is much better than FAI and SIS. It also includes SGE.

On Jan 20, 2005, at 3:47 PM, Speakman, John H./Epidemiology-Biostatistics wrote:

Hello

If anyone can review the below and suggest a way to go, or even better point out something I have gotten completely wrong, it would be much appreciated!

Thanks
John

Hardware: Ten HP ProLiant nodes, one DL380 and nine DL140. Each node has two 3.2 GHz Xeon processors. They do not have a dedicated switch; the infrastructure folks say they want to implement this using a VLAN. We have some performance concerns here but have agreed to give it a try.

User characteristics: The users are biostatisticians who typically program in R; they often use plug-in R modules like Bioconductor. They always want the newest version of R right away. They may also write programs in C or Fortran. Data files are usually small. Nothing fancy like BLAST, etc.

User concerns: Users require a Linux clustering environment which enables them to interact with the cluster as though it were a single system (via ssh or X) but which will distribute compute-intensive jobs across nodes. As the code is by and large not multithreaded, it is expected that each job will be farmed out to an idle compute node and probably stay there until it is done. That's fine. In other words, to use all twenty CPUs we will need twenty concurrent jobs.

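The placement model described under "User concerns" can be sketched roughly as below. This is only an illustration of the idea (one single-threaded job per CPU, no migration once placed), not the behaviour of OpenMosix, OpenPBS, Grid Engine or any other real scheduler. The function name place_jobs and the job names are made up for the example; the node and CPU counts come from the hardware description above.

```python
from collections import deque

NODES = 10          # ten nodes, as in the hardware description
CPUS_PER_NODE = 2   # two Xeon processors per node


def place_jobs(jobs, nodes=NODES, cpus_per_node=CPUS_PER_NODE):
    """Assign each job to an idle CPU slot; a placed job never migrates.

    Returns (placement, waiting): placement maps job -> (node, cpu),
    waiting lists the jobs that found no idle slot.
    """
    # Hand out the first CPU of every node before any second CPU,
    # so jobs spread across nodes before doubling up on one.
    idle = deque((n, c) for c in range(cpus_per_node) for n in range(nodes))
    placement = {}
    waiting = []
    for job in jobs:
        if idle:
            placement[job] = idle.popleft()
        else:
            waiting.append(job)
    return placement, waiting


if __name__ == "__main__":
    # Hypothetical workload: 22 single-threaded jobs on 20 CPUs.
    jobs = [f"job{i}" for i in range(22)]
    placement, waiting = place_jobs(jobs)
    print(len(placement), "running;", len(waiting), "waiting")
```

With twenty or more jobs queued, every CPU is busy; with fewer, the remaining slots sit idle, which is why single-threaded code needs twenty concurrent jobs to fill the cluster.
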
Administration concerns: The cluster must require the absolute minimum of configuration and maintenance, because I've got to do it and I'm hardly ever around these days.

Other concerns: Users and administrators alike have a preference for Debian Linux over other distributions. Users also have an aversion to non-free software. Either or both of these considerations could be overridden if the reasons were pressing.

Cluster software requirements:

(1) The cluster must have a means of deploying Linux to the nodes and keeping their configurations (including updates to the operating system and applications, lists of users, printers, etc.) in synchronization.

(2) The cluster must have a means of transparently distributing jobs to idle CPUs. It's not necessary to actively rebalance once a job has started - it's okay if, once tied to a node, it stays there.

Potential solutions: We like the look of NPACI Rocks but its non-Debian-ness makes it a last resort only. What we would really like to try is a Debian version of NPACI Rocks; in its absence we will probably have to use two separate packages to fulfil requirements #1 and #2 above.

Sensible options for #1 seem to be:

(1) SystemImager (www.systemimager.org)
(2) FAI (http://www.informatik.uni-koeln.de/fai/), maybe also involving the use of cfengine2 (http://www.iu.hio.no/cfengine/)

SystemImager is the better-established product and looks to be simpler to set up than FAI and/or cfengine2, both of which appear to have a steep learning curve. However, FAI seems more elegant and closer to the idea of "NPACI Rocks Debian" that we're looking for, which suggests that once set up FAI/cfengine2 will require less ongoing maintenance.

Sensible options for #2 seem to be:

(1) OpenMosix
(2) OpenPBS
(3) Sun GridEngine N1

Note: all of the above have commercial versions; we'd be reluctant to consider them unless it means big savings in administration time and effort.

We get the impression that OpenMosix (and, to a lesser extent, OpenPBS) have question marks over how much time and resources the people maintaining these products have, suggesting bugs, instability and not keeping up with kernel/library updates, etc. Sun GridEngine seems more robust but does not seem to have a big Debian user base.

What do you all think we should try first?

Thanks!
John

John Speakman
Manager, Clinical Research Systems
Memorial Sloan-Kettering Cancer Center
307 East 63rd Street, New York NY 10021 USA
+1 646 735 8187 - SpeakmaJ at mskcc.org

_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters

Glen Otero Ph.D.
Linux Prophet