[Bioclusters] Re:Advice on getting started with clustering, LSF, Xserve

Simon Twigger bioclusters@bioinformatics.org
Mon, 11 Nov 2002 09:49:37 -0600


Hi Elia,

Many thanks for your thoughts on all of this - I agree it's a little
optimistic to hope for a 'dummies' guide, but your overview was very
useful and I just need somewhere to start. I'm hoping to chat to the
LSF people this week to get some idea of how that might work, and the
pricing information I get from them will be interesting to see in light
of recent posts to the list.

How did you go about getting Xserves on loan/demo from Apple? If I can
get a few to try before I buy, that would be great.

Many thanks,

Simon.


On Wednesday, Nov 6, 2002, at 19:36 America/Chicago, Elia Stupka wrote:

>> I'd love to find some sort of 'bioclustering for dummies' that
>> outlines the usual solutions and approaches, also on the software side
>> something that describes the fundamentals of writing perl and java to
>> exploit clusters and even some simple examples/test packages that I
>> could play with to get my feet wet.
>
> Unfortunately there isn't as simple a thing as a "bioclusters for
> dummies", else consultants would be out of business and mailing lists
> would be dead ;)
>
> My side of the expertise in this area is with bioinformatics pipelines,
> having worked with the Ensembl pipeline and now having developed with my
> group our own flexible open source pipeline in Perl. You are welcome to
> read the docs of BioPipe at www.biopipe.org. As a general note it is
> absolutely worthwhile having something like LSF for your load
> sharing. Even though BioPipe does a lot of job management and tracking, we
> still rely on a load sharing software underneath it. We tend to use LSF
> because we could afford it. If you can afford it, it is by far the most
> robust and sophisticated load sharing software. Bear in mind that prices
> have been going down, and also that they (don't quote me) change depending
> on the weather and the LSF salesman usually... If you can't get LSF, SGE
> (Sun Grid Engine) is a good alternative, and so is PBSPro. We have the
> wrappers for LSF and PBS for BioPipe. We never got around to writing one
> for SGE, but it's very straightforward, just an extra module that issues
> the right commands...
>
>> ease of administration seems to be another pro for LSF, which is a big
>> thing as we just want it to work; we don't really want to babysit this
>> stuff - what sort of sysadmin commitment is needed to make this work?
>
> LSF will save you great job management headaches, as long as the initial
> setup is done well. Sysadmin commitment will be shifted to more standard
> queries like installing programs, etc. Bear in mind that you need a
> brilliant sysadmin in the first month or so, when you are building the
> cluster, deciding how to spread the BLAST databases, optimising LSF, etc.
>
> BioPipe will save you the second layer of headaches, i.e. automating
> analysis workflows, reproducing them easily, etc.
>
>> any thoughts/experiences with using Xserve in the mix with other
>> platforms and Xserve vs intel solutions?
>
> We are currently experimenting with Xserves that we have on loan from
> Apple, and are incredibly pleased by some of the performance we get
> out. Hybrid clusters are never a major issue *as long as* you have a good
> sysadmin who can deal with endianness, file systems, and as long as you
> give the sysadmin a good picture of where the possible bottlenecks will
> be.
>
> Hope it helped,
>
> Elia
>
> ********************************
> * http://www.fugu-sg.org/~elia *
> * tel:    +65 6874 1467        *
> * mobile: +65 9030 7613        *
> * fax:    +65 6779 1117        *
> ********************************
------------------------------------------------------------------------
Simon Twigger, Ph.D.
Assistant Professor, Bioinformatics Research Center

Medical College of Wisconsin
8701 Watertown Plank Road,
Milwaukee, WI, 53226
tel. 414-456-8802, fax 414-456-6595