[Bioclusters] blast server on OpenMosix cluster

bioclusters@bioinformatics.org bioclusters@bioinformatics.org
Mon, 5 Jan 2004 12:53:45 -0500 (EST)


Hello Lewis,

Thank you. In my case, each tool has several parts: e.g., BLAST includes
data, executables, and web interfaces (html, .REAL, .cgi files), and the
other tools contain at least data and a web interface (Python/Perl/HTML
files). Which parts do you think need to be put on every node, and which
do not? Apart from BLAST, we use a PostgreSQL server to house the data.

Thanks

Hong
> Hi,
> Yes, it is very possible to do all of what you're talking about. I did it
> using Inquiry at the Danforth Center (www.danforthcenter.org). One
> can run jobs via the command line or the web interface. Here is a short
> list of applications that can take advantage of Sun Grid Engine and qrsh.
> You may also use Pise to add custom applications. We went with Apple's
> Xserves too. So far all is well, and it's kind of nice to have all the
> support for setup. Anyone is welcome to stop in and look at our
> system if they want; contact me directly at
> jlewis@danforthcenter.org. Here is the list of apps that are already
> configured in Inquiry:
> abiview
> antigenic
> backtranseq
> banana
> biosed
> blast2
> btblastall
> btwisted
> cai
> chaos
> charge
> checktrans
> chips
> cirdna
> clustalw
> clustalw_convert
> codcmp
> coderet
> compseq
> cons
> cpgplot
> cpgreport
> cusp
> cutseq
> dan
> degapseq
> descseq
> diffseq
> digest
> distmat
> domainer
> dotmatcher
> dotpath
> dottup
> dreg
> einverted
> emma
> emowse
> eprimer3
> equicktandem
> est2genome
> etandem
> extractfeat
> extractseq
> fasta
> findkm
> fmtseq
> freak
> fuzznuc
> fuzzpro
> fuzztran
> geecee
> getorf
> glimmer
> helixturnhelix
> hmmalign
> hmmbuild
> hmmcalibrate
> hmmconvert
> hmmemit
> hmmer2sam
> hmmfetch
> hmmpfam
> hmmscore
> hmmsearch
> hmoment
> html4blast
> iep
> infoalign
> infoseq
> interface
> isochore
> lindna
> listor
> loadseq
> makehist
> marscan
> maskfeat
> maskseq
> matcher
> megamerger
> melting
> merger
> msbar
> mview_alig
> mview_blast
> mwfilter
> needle
> newcpgreport
> newcpgseek
> newseq
> notseq
> nrscope
> nthseq
> octanol
> oddcomp
> palindrome
> pasteseq
> patmatdb
> patmatmotifs
> pepcoil
> pepinfo
> pepnet
> pepstats
> pepwheel
> pepwindow
> pepwindowall
> phiblast
> plotcon
> plotorf
> polydot
> preg
> prettyplot
> prettyseq
> primersearch
> profit
> prophecy
> prophet
> pscan
> psiblast
> readseq
> recoder
> redata
> remap
> restover
> restrict
> revseq
> scope
> scopparse
> seqmatchall
> showalign
> showfeat
> showorf
> showseq
> shuffleseq
> sigcleave
> siggen
> sigscan
> silent
> splitter
> stretcher
> stssearch
> supermatcher
> syco
> tfscan
> tmap
> transeq
> trimest
> trimseq
> vectorstrip
> water
> whichdb
> wise2
> wobble
> wordcount
> wordmatch
> xblast
> On Jan 5, 2004, at 10:41 AM, hong.zhang@research.dfci.harvard.edu wrote:
>
>> Hi Chris,
>> Thanks for your message. It is really encouraging. My further
>> question is that we have other web tools besides wwwblast installed on
>> the cluster, so does SGE make all the tools distributable across the
>> nodes, or is it only a single-job cluster (I mean only for blast, such
>> as mpiblast)?
>>
>> Also, how much space is needed to host the BLAST data?
>>>
>>> Hong Zhang,
>>>
>>> There are several clusters doing Blast and Blast over WWW at Harvard.
>>> Contact me in private if you want contact information for the people
>>> running them.
>>>
>>> The Bauer Center for Genomics Research has a big cluster system
>>> running
>>> Platform LSF. (http://cgr.harvard.edu)
>>>
>>> The Harvard Stats department over in the Science Center is running
>>> Grid
>>> Engine on a small Linux cluster.
>>>
>>> The Flybase project people are using Grid Engine on Mac OS X (apple
>>> Xserves) for some lightweight web bioinformatics portal stuff
>>> (http://inquiry.flybase.harvard.edu)
>>>
>>> There are several more systems I've heard about or visited over at
>>> the Medical school etc.
>>>
>>> Regarding your questions:
>>>
>>> 1. wwwblast servers are easy to set up on clusters. For a lightweight
>>> system you can just take the LSF 'lsrun' or Grid Engine 'qrsh' commands
>>> and use them to wrap the call to the blastall executable. This will not
>>> work in a large setting, as qrsh/lsrun will fail silently if there are
>>> no resources available; in that case you need to go asynchronous and
>>> get used to the batch system.
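>>>
>>> As a rough illustration (not taken from wwwblast itself), a minimal
>>> wrapper along these lines is roughly what the 'qrsh' approach amounts
>>> to; the database path and output format below are placeholders, not
>>> real site configuration:
>>>
>>> #!/usr/bin/env python
>>> # Sketch only: run blastall through Grid Engine's 'qrsh' so the search
>>> # executes on whichever node SGE picks. A web CGI would call this with
>>> # the uploaded query written to storage visible from every node.
>>> import subprocess
>>> import sys
>>>
>>> query_fasta = sys.argv[1]   # query file on shared storage
>>>
>>> cmd = [
>>>     "qrsh",                 # with its defaults, qrsh fails right away
>>>                             # if no slot is free, hence the advice to
>>>                             # go asynchronous (qsub) on a busy cluster
>>>     "blastall",
>>>     "-p", "blastn",
>>>     "-d", "/shared/db/nt",  # placeholder database path
>>>     "-i", query_fasta,
>>>     "-m", "9",              # tabular output, easy to render in a page
>>> ]
>>> result = subprocess.run(cmd, capture_output=True, text=True)
>>> sys.stdout.write(result.stdout if result.returncode == 0 else result.stderr)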
>>>
>>> 2. SGE runs easily on Debian Linux.
>>>
>>> Regards,
>>> Chris
>>>
>>>
>>>
>>>
>>> Hong Zhang wrote:
>>>
>>>> Thanks for your information. I had read the article before.
>>>> I'd like to know:
>>>> 1. whether it is possible to set up a wwwblast server on a
>>>> cluster. Our goal is to allow users to access the BLAST databases
>>>> through a web page instead of the command line. I am not sure
>>>> whether a query from the web page can be migrated.
>>>>
>>>> 2. whether SGE can be used on Debian.
>>>>
>>>>
>>>>  On Fri, 2 Jan 2004, Ron Chen wrote:
>>>>
>>>>
>>>>> It takes time for OpenMosix to migrate your jobs;
>>>>> SGE is more suitable in a compute farm environment.
>>>>>
>>>>> "Integrating BLAST with Sun ONE Grid Engine Software"
>>>>> available at:
>>>>> http://developers.sun.com/solaris/articles/integrating_blast.html
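>>>>>
>>>>> For reference, batch submission with SGE boils down to something
>>>>> like this sketch (the job name, database path and file names are
>>>>> illustrative only, not taken from that article):
>>>>>
>>>>> #!/usr/bin/env python
>>>>> # Sketch: hand a blastall run to SGE as a batch job instead of
>>>>> # relying on OpenMosix process migration.
>>>>> import subprocess
>>>>>
>>>>> job_script = """#!/bin/sh
>>>>> #$ -N blast_demo
>>>>> #$ -cwd
>>>>> blastall -p blastp -d /shared/db/nr -i query.fa -o query.out
>>>>> """
>>>>>
>>>>> # qsub accepts the job script on stdin and returns at once; the
>>>>> # scheduler dispatches the job to a free node when one is available.
>>>>> subprocess.run(["qsub"], input=job_script, text=True, check=True)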
>>>>>
>>>>> -Ron
>>>>>
>>>>> --- Hong Zhang <hzhang@research.dfci.harvard.edu>
>>>>> wrote:
>>>>>
>>>>>> But I have trouble making the blast command line execute
>>>>>> on every node.
>>>>>>
>>>>>> And don't you think OpenMosix is suitable for a blast
>>>>>> cluster? You suggested SGE?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, 11 Dec 2003, Farul Mohd. Ghazali wrote:
>>>>>>
>>>>>>
>>>>>>> On Wed, 10 Dec 2003, hong.zhang@research.dfci.harvard.edu wrote:
>>>>>>>
>>>>>>>> I am working on setting up a blast server on a Debian/OpenMosix
>>>>>>>> cluster with 4 nodes. Actually it is totally new to me, so is
>>>>>>>> there anyone who can give me some advice? Thanks.
>>>>>>>
>>>>>>> I've used OpenMosix in the form of ClusterKnoppix some months back
>>>>>>> to test it out. The setup was very easy: boot off the CD, configure
>>>>>>> some settings, and the rest of the nodes boot off the network.
>>>>>>> Applications are automatically load balanced across the nodes.
>>>>>>>
>>>>>>> While configuration and actual use were very easy, performance
>>>>>>> wasn't too great. I think the main reason was that OpenMosix
>>>>>>> dynamically migrates applications to the different nodes to
>>>>>>> automatically load balance the system, so the overhead of
>>>>>>> migration for long-running jobs suddenly became apparent.
>>>>>>>
>>>>>>> To be honest, we didn't try to optimize it much and went on to
>>>>>>> implement our blast cluster with SGE and, hopefully soon, mpiblast.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>> --
>>> Chris Dagdigian, <dag@sonsorol.org>
>>> Independent life science IT & informatics consulting
>>> Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
>>> PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net
>>>
>>
>>
>>
>>
>>
> John F. Lewis III
> www.danforthcenter.org
> jlewis@danforthcenter.org
> 314-587-1028
> _______________________________________________
> Bioclusters maillist  -  Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters


-- 
Hong Zhang, MIS
Bioinformatics Analyst
Dana Farber Cancer Institute
Harvard Medical School
44 Binney St, D1510A
Boston MA 02115
Email: hong.zhang@research.dfci.harvard.edu
Phone: 617-632-3824
Fax: 617-632-3351