[Bioclusters] Alternative algorithms for BLAST
Eitan Rubin
ERubin at CGR.Harvard.edu
Fri Mar 4 14:47:14 EST 2005
Hi,
I have been following the BLAST / MPI-BLAST discussion. I have had great
experience with BLAT: as long as your problem fits its model, it is so much
quicker than BLAST. One big drawback: no statistics are provided (i.e.
no E-value). That didn't bother me, since I was looking for near identities.
Eitan
--------------------
Eitan Rubin, PhD
Head of Bioinformatics
The Bauer Center for Genomics Research
Harvard University
Tel: 617-496-5649 Fax: 617-495-2196
-----Original Message-----
From: bioclusters-request at bioinformatics.org
[mailto:bioclusters-request at bioinformatics.org]
Sent: Friday, March 04, 2005 12:09 PM
To: bioclusters at bioinformatics.org
Subject: Bioclusters Digest, Vol 5, Issue 5
Today's Topics:
1. Re: error while running mpiblast (Tim Cutts)
2. Re: error while running mpiblast (Aaron Darling)
----------------------------------------------------------------------
Message: 1
Date: Thu, 3 Mar 2005 22:41:57 +0000
From: Tim Cutts <tjrc at sanger.ac.uk>
Subject: Re: [Bioclusters] error while running mpiblast
To: "Clustering, compute farming & distributed computing in life
science informatics" <bioclusters at bioinformatics.org>
Message-ID: <59266712c8eca3506c94c20a06828efa at sanger.ac.uk>
Content-Type: text/plain; charset=US-ASCII; format=flowed
On 2 Mar 2005, at 5:59 am, James Cuff wrote:
> mpiblast works. Really very well for certain problems. There I said it.
>
> Guy and Tim will probably never forgive me... I think I may have been
> the original 'embarrassingly parallel is the only way, nothing else
> will ever give the throughput, yada, yada' advocate...
Aargh - he's gone over to the Dark Side!!!
Seriously, I agree with you. MPIBlast gets you fast turnaround for
single very large searches. I still think for the things Sanger are
doing we do better with the embarrassingly parallel model, but I
wouldn't claim that it's always the right solution (at least not any
more, he said, covering his tracks in case he's ever said exactly that
somewhere in the past)
>> Note: We have not built the mpiblast RPM for Itanium (nor for that
>> matter, any of our RPMs). Is there any interest in this? Curious.
>
> Shame they cost so darn much, well ours do, but folk keep demanding me
> to cram 64GB in them for something called whole genome assembly. I just
> can't for the life of me understand why they cost so much :-)
It's a good argument for getting people to write more memory-efficient
code:
"And just how many times your annual salary does the memory you're
asking for cost?"
Tim
--
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233
------------------------------
Message: 2
Date: Thu, 03 Mar 2005 17:30:34 -0600
From: Aaron Darling <darling at cs.wisc.edu>
Subject: Re: [Bioclusters] error while running mpiblast
To: "Clustering, compute farming & distributed computing in life
science informatics" <bioclusters at bioinformatics.org>
Message-ID: <42279E1A.2090507 at cs.wisc.edu>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Tim Cutts wrote:
>
> On 2 Mar 2005, at 5:59 am, James Cuff wrote:
>
>> mpiblast works. Really very well for certain problems. There I said it.
>>
>> Guy and Tim will probably never forgive me... I think I may have been
>> the original 'embarrassingly parallel is the only way, nothing else
>> will ever give the throughput, yada, yada' advocate...
>
>
> Aargh - he's gone over to the Dark Side!!!
>
Haha! You guys crack me up :-) You will be assimilated. Resistance
is futile.
> Seriously, I agree with you. MPIBlast gets you fast turnaround for
> single very large searches. I still think for the things Sanger are
> doing we do better with the embarrassingly parallel model, but I
> wouldn't claim that it's always the right solution (at least not any
> more, he said, covering his tracks in case he's ever said exactly that
> somewhere in the past)
For whatever it's worth, my opinion is that there are far better --
faster *and* more sensitive -- local alignment algorithms than BLAST and
it's a shame they haven't come into wide use. If (when) NCBI takes one
of those better algorithms and calls it BLAST I bet it will get used.
People know NCBI BLAST, people trust NCBI BLAST. As long as that
remains true there will be a place for parallel NCBI BLAST, and with
huge databases even database segmentation will have a place. What I
really look forward to is for somebody to come up with a clever
compression and searchable indexing scheme that accounts for the huge
amount of redundancy in big databases like nt. Then we won't need
mpiBLAST anymore.
-Aaron
------------------------------
_______________________________________________
Bioclusters maillist - Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters
End of Bioclusters Digest, Vol 5, Issue 5
*****************************************