[Bioclusters] Alternative algorithms for BLAST

Fri Mar 4 14:47:14 EST 2005

Hi,

  I have been following the BLAST / MPI-BLAST discussion. I have had a great
experience with BLAT - as long as your problem fits the model, it is so much
quicker. It's soooo fast. One big drawback: no statistics are provided (i.e.
no e-value). That didn't bother me - I was looking for near identities.

  Eitan

--------------------
Eitan Rubin, PhD
Head of Bioinformatics
The Bauer Center for Genomics Research
Harvard University
Tel: 617-496-5649 Fax: 617-495-2196

-----Original Message-----
From: bioclusters-request at bioinformatics.org
[mailto:bioclusters-request at bioinformatics.org] 
Sent: Friday, March 04, 2005 12:09 PM
To: bioclusters at bioinformatics.org
Subject: Bioclusters Digest, Vol 5, Issue 5

Today's Topics:

   1. Re: error while running mpiblast (Tim Cutts)
   2. Re: error while running mpiblast (Aaron Darling)

----------------------------------------------------------------------

Message: 1
Date: Thu, 3 Mar 2005 22:41:57 +0000
From: Tim Cutts <tjrc at sanger.ac.uk>
Subject: Re: [Bioclusters] error while running mpiblast
To: "Clustering, compute farming & distributed computing in life
	science informatics" <bioclusters at bioinformatics.org>
Message-ID: <59266712c8eca3506c94c20a06828efa at sanger.ac.uk>
Content-Type: text/plain; charset=US-ASCII; format=flowed

On 2 Mar 2005, at 5:59 am, James Cuff wrote:

> mpiblast works.  Really very well for certain problems.  There I said it.
>
> Guy and Tim will probably never forgive me...  I think I may have been the
> original 'embarrassingly parallel is the only way, nothing else will ever
> give the throughput, yada, yada' advocate...

Aargh - he's gone over to the Dark Side!!!

Seriously, I agree with you.  MPIBlast gets you fast turnaround for 
single very large searches.  I still think for the things Sanger are 
doing we do better with the embarrassingly parallel model, but I 
wouldn't claim that it's always the right solution (at least not any 
more, he said, covering his tracks in case he's ever said exactly that 
somewhere in the past)

>> Note: We have not built the mpiblast RPM for Itanium (nor for that
>> matter, any of our RPMs).  Is there any interest in this?  Curious.
>
> Shame they cost so darn much, well ours do, but folk keep demanding me to
> cram 64GB in them for something called whole genome assembly.  I just
> can't for the life of me understand why they cost so much :-)

It's a good argument for getting people to write more memory-efficient 
code:

"And just how many times your annual salary does the memory you're 
asking for cost?"

Tim

-- 
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5  860B 3CDD 3F56 E313 4233

------------------------------

Message: 2
Date: Thu, 03 Mar 2005 17:30:34 -0600
From: Aaron Darling <darling at cs.wisc.edu>
Subject: Re: [Bioclusters] error while running mpiblast
To: "Clustering, compute farming & distributed computing in life
	science informatics" <bioclusters at bioinformatics.org>
Message-ID: <42279E1A.2090507 at cs.wisc.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Tim Cutts wrote:

>
> On 2 Mar 2005, at 5:59 am, James Cuff wrote:
>
>> mpiblast works.  Really very well for certain problems.  There I said it.
>>
>> Guy and Tim will probably never forgive me...  I think I may have been the
>> original 'embarrassingly parallel is the only way, nothing else will ever
>> give the throughput, yada, yada' advocate...
>
>
> Aargh - he's gone over to the Dark Side!!!
>
Haha!  You guys crack me up :-)   You will be assimilated.  Resistance 
is futile.

> Seriously, I agree with you.  MPIBlast gets you fast turnaround for 
> single very large searches.  I still think for the things Sanger are 
> doing we do better with the embarrassingly parallel model, but I 
> wouldn't claim that it's always the right solution (at least not any 
> more, he said, covering his tracks in case he's ever said exactly that 
> somewhere in the past)

For whatever it's worth, my opinion is that there are far better -- 
faster *and* more sensitive -- local alignment algorithms than BLAST and 
it's a shame they haven't come into wide use.  If (when) NCBI takes one 
of those better algorithms and calls it BLAST I bet it will get used.  
People know NCBI BLAST, people trust NCBI BLAST.  As long as that 
remains true there will be a place for parallel NCBI BLAST, and with 
huge databases even database segmentation will have a place.  What I 
really look forward to is for somebody to come up with a clever 
compression and searchable indexing scheme that accounts for the huge 
amount of redundancy in big databases like nt.  Then we won't need 
mpiBLAST anymore.

-Aaron

------------------------------

_______________________________________________
Bioclusters maillist  -  Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters

End of Bioclusters Digest, Vol 5, Issue 5
*****************************************