> I'd love to find some sort of 'bioclustering for dummies' that
> outlines the usual solutions and approaches, also on the software side
> something that describes the fundamentals of writing perl and java to
> exploit clusters and even some simple examples/test packages that I
> could play with to get my feet wet.

Unfortunately there is nothing as simple as a "bioclusters for dummies", else consultants would be out of business and mailing lists would be dead ;)

My own expertise in this area is with bioinformatics pipelines: I have worked with the Ensembl pipeline, and my group has now developed its own flexible open source pipeline in Perl. You are welcome to read the BioPipe docs at www.biopipe.org.

As a general note, it is absolutely worthwhile having something like LSF for your load sharing. Even though BioPipe does a lot of job management and tracking, we still rely on load sharing software underneath it. We use LSF because we could afford it. If you can afford it, it is in my experience by far the most robust and sophisticated load sharing software. Bear in mind that prices have been going down, and also that they (don't quote me) usually change depending on the weather and the LSF salesman...

If you can't get LSF, SGE (Sun Grid Engine) is a good alternative, and so is PBSPro. We have wrappers for LSF and PBS in BioPipe. We never got around to writing one for SGE, but it's very straightforward: just an extra module that issues the right commands...

> ease of administration seems to be another pro for LSF which is a big
> thing as we just want it to work, we dont really want to babysit this
> stuff - what sort of sysadmin commitment is needed to make this work?

LSF will save you great job management headaches, as long as the initial setup is done well. The sysadmin commitment will then shift to more routine requests such as installing programs, etc.
Bear in mind that you need a brilliant sysadmin in the first month or so, when you are building the cluster, deciding how to spread the BLAST databases across the nodes, optimising LSF, etc. BioPipe will save you the second layer of headaches, i.e. automating analysis workflows, reproducing them easily, etc.

> any thoughts/experiences with using Xserve in the mix with other
> platforms and Xserve vs intel solutions?

We are currently experimenting with Xserves that we have on loan from Apple, and are very pleased with some of the performance we are getting out of them. Hybrid clusters are never a major issue *as long as* you have a good sysadmin who can deal with endianness and file systems, and as long as you give the sysadmin a good picture of where the possible bottlenecks will be.

Hope this helps,

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel: +65 6874 1467           *
* mobile: +65 9030 7613        *
* fax: +65 6779 1117           *
********************************
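
P.S. To make the "extra module that issues the right commands" remark a bit more concrete, here is a rough sketch of what such a wrapper layer might look like. It is written in Python rather than Perl purely for brevity, and it is NOT BioPipe's actual wrapper interface: the names Job, build_submit and submit are hypothetical, as are the queue and file names in the usage note. The flags are the commonly documented ones for bsub and qsub; treat them as assumptions and check them against the man pages on your own cluster.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description (not BioPipe's real data model)."""
    name: str         # job name shown by the scheduler
    queue: str        # queue to submit to
    command: str      # shell command to run on the compute node
    stdout_path: str  # file that receives the job's standard output


def lsf_submit(job):
    # LSF: bsub takes the command to run as trailing arguments.
    argv = ["bsub", "-q", job.queue, "-J", job.name, "-o", job.stdout_path]
    return argv + shlex.split(job.command), None


def sge_submit(job):
    # SGE: "-b y" makes qsub treat the trailing arguments as a command
    # rather than a script file; "-cwd" runs it in the current directory.
    argv = ["qsub", "-q", job.queue, "-N", job.name, "-o", job.stdout_path,
            "-cwd", "-b", "y"]
    return argv + shlex.split(job.command), None


def pbs_submit(job):
    # PBS: with no script file given, qsub reads the job script from
    # standard input, so the command is passed as stdin text instead.
    argv = ["qsub", "-q", job.queue, "-N", job.name, "-o", job.stdout_path]
    return argv, job.command + "\n"


BUILDERS = {"lsf": lsf_submit, "sge": sge_submit, "pbs": pbs_submit}


def build_submit(scheduler, job):
    """Return (argv, stdin_text) for the given scheduler, without running it."""
    try:
        builder = BUILDERS[scheduler]
    except KeyError:
        raise ValueError("unsupported scheduler: %s" % scheduler)
    return builder(job)


def submit(scheduler, job):
    """Run the submit command and return whatever the scheduler prints."""
    argv, stdin_text = build_submit(scheduler, job)
    result = subprocess.run(argv, input=stdin_text, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

With something like this, the rest of the pipeline would only ever call submit("sge", job) or submit("lsf", job), e.g. with job = Job("blast_001", "normal", "blastall -p blastp -d nr -i chunk_001.fa", "blast_001.out"), and supporting another scheduler means adding one more small function to BUILDERS. A real wrapper would also need to parse the returned job id and poll for job status, which this sketch leaves out.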