You rang .... :)

Brodie, Kent wrote:
> Q: can someone point me to the results obtained by Joe Landman? (web
> site, or..?)
>
> Many thanks,
> -- Kent C. Brodie, Medical College of Wisconsin
>
>>-----Original Message-----
>>From: bioclusters-bounces+brodie=mcw.edu at bioinformatics.org
>>[mailto:bioclusters-bounces+brodie=mcw.edu at bioinformatics.org] On Behalf
>>Of Chris Dagdigian
>>Sent: Thursday, February 03, 2005 12:28 PM
>>To: Hrishikesh Deshmukh; Clustering, compute farming & distributed
>>computing in life science informatics
>>Subject: Re: [Bioclusters] Questions on mpiBLAST
>>
>>"Parallelizing" blast across cluster nodes only results in significant
>>speed gains if you are trying to solve a large problem set or have a
>>massive target database that in no way, shape or form can squeeze into
>>physical memory on one node.
>>
>>The performance of BLAST is rate-limited first by how much RAM you have
>>and then by how fast your disk I/O system is.
>>
>>I think Joe Landman has also seen incredible variations in blast
>>performance by experimenting with non-GNU architecture-optimized
>>compilers like those from IBM, Intel and the Portland Group.
>>
>>16 machines with 2GB of RAM reading database files off of Ethernet-based
>>NFS is a "normal" compute farm config.
>>
>>Outside of mpiblast you could be seeing performance lags caused by your
>>network (if you are reading/writing via NFS or AFP) or by physical memory.
>>
>>I'm not an expert on mpiblast but hope to start a personal project soon
>>to integrate it with grid engine, mostly to satisfy my own curiosity.
>>
>>I agree with what Hrishikesh said about your times -- you are searching with
>>a very small query set and you did not mention your target database.
>>
>>You may see better performance using one machine -- the first query will
>>be slow but the other queries will come back faster since most or part
>>of the target database will still be mmapped or whatever in RAM.
>>
>>If you really want to test mpiblast out you need to pick a much larger
>>query and target DB set.
>>
>>-Chris
>>
>>Hrishikesh Deshmukh wrote:
>>
>>>Hi,
>>>I am no authority on BLAST. I guess you see a linear speedup
>>>only when the problem is huge; for 20-odd sequences mpiblast doesn't
>>>pay off, and your ncbi blast is good enough! Just curious: do the results
>>>from ncbi and mpiblast for the same dataset (input) match exactly?!
>>>
>>>I am trying to get BLAST and mpiBLAST running on Sun Grid. Right now
>>>BLAST works in serial mode and mpiBLAST is kind of stuck!
>>>
>>>Cheers,
>>>Hrishi
>>>
>>>On Thu, 03 Feb 2005 11:45:45 -0500, Xiaowu Gai <xgai at genome.chop.edu>
>>>wrote:
>>>
>>>>Hi Everyone:
>>>>
>>>>We have a 16-node Xserve cluster, with 2GB memory on each node and dual
>>>>processors. I was able to install mpiBLAST on it, along with LAM/MPI.
>>>>However, the performance that I saw with some test runs has not been that
>>>>good and is quite confusing. Here is what I did:
>>>>
>>>>1.) I formatted the nt database:
>>>>
>>>>mpiformatdb -N 16 -i nt
>>>>
>>>>2.) I ran mpiblast on one, two, five, ten, twenty, and more sequences
>>>>(about 500bp each) with the command:
>>>>
>>>>time mpirun N mpiblast -p blastn -d nt -i single.fa -o blast_results.
>>>>
>>>>Here are the numbers:
>>>>
>>>>Single: 1m39.054s
>>>>Two:    0m11.009s
>>>>Five:   0m16.021s
>>>>Ten:    0m46.591s
>>>>Twenty: 3m7.541s
>>>>..
>>>>
>>>>I am all confused. First of all, the performance is not that impressive.
>>>>Secondly, the numbers are very confusing to me.
>>>>Why is it that a single-sequence query takes so much more time than a
>>>>query of two (BTW, I reran the query of a single sequence right after the
>>>>query of two and got similar results)? And the query of five takes only
>>>>5 seconds more than the query of two, and so on..
>>>>
>>>>I am afraid that I have done something wrong and would really appreciate
>>>>any thoughts.
>>>>
>>>>Thanks
>>>>
>>>>Xiaowu
>>>>
>>>>_______________________________________________
>>>>Bioclusters maillist - Bioclusters at bioinformatics.org
>>>>https://bioinformatics.org/mailman/listinfo/bioclusters
>>
>>--
>>Chris Dagdigian, <dag at sonsorol.org>
>>BioTeam - Independent life science IT & informatics consulting
>>Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
>>PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
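
For reference, the wall-clock times quoted in Xiaowu's post can be normalized to seconds and to seconds per query, which makes the irregularity easier to compare. The following is only a minimal Python sketch: the timings are copied from the post above, while the names `reported` and `to_seconds` are hypothetical helpers introduced here and are not part of mpiBLAST or of anyone's actual test setup.

```python
import re

# Wall-clock times copied from the quoted post, keyed by the number of
# query sequences in each run. (Dictionary name is illustrative only.)
reported = {
    1: "1m39.054s",
    2: "0m11.009s",
    5: "0m16.021s",
    10: "0m46.591s",
    20: "3m7.541s",
}


def to_seconds(stamp):
    """Convert a `time`-style string such as '3m7.541s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Print the total and the average per-query time for each run.
for n_queries, stamp in reported.items():
    total = to_seconds(stamp)
    per_query = total / n_queries
    print(f"{n_queries:>2} queries: {total:8.3f} s total, {per_query:7.3f} s per query")
```

Normalized this way, the single-query run costs roughly 99 s, while the larger runs work out to somewhere between about 3 and 9 s per query. That is the discrepancy the original question is about; the sketch only restates the reported numbers and does not explain them.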