[Bioclusters] SSE2 HMMer
Ian Korf
bioclusters@bioinformatics.org
Thu, 26 Jun 2003 10:43:56 +0100
On Wednesday, June 25, 2003, at 11:07 PM, David Huen wrote:
>
> As for the Apple-Genentech Blast speedup - it is bunkum. The improvements
> are solely from a superior algorithm and I did port their version over to
> x86 and it shows the same speedups too. I am very disappointed that
> they have repeated these claims for the G5 as I had spoken to one of their
> bioinformatics support team about it early this year. He adamantly denied
> that there was anything in their claim that remotely implied it was down
> to their hardware and that they had provided the algorithmic improvements
> back to the community. I left it as that. I regret it will force me to
> go back and repeat all those measurements and this time document it
> publicly with sources and all.
>
> Actually their changes are not all that useful in that there is no effect
> at the default wordlength and I think it is not sensible to use Blast at
> large wordlengths - what would you achieve that other algorithms won't do
> better (thinking SSAHA)?
I completely agree that using BLAST with a word size of 40 is not a
good idea and that SSAHA or maybe BLAT is better for nearly identical
sequences. Still, sometimes it's easier to use a program you know well
rather than switch to something less familiar. So AG-BLAST does have
some utility with large word sizes.
My big problem with the benchmark is that it plays down the greatest gain
in speed, which is actually at short word lengths. For cross-species work, I use a word
size of 9 or 10. Here, AG-BLAST is 5-8 times faster. Compared to a dual
Xeon (without similar code optimizations), it may be even faster than
that. This is really useful, and the idiots doing the benchmarks don't
display this. You can see the speed difference in the graph they
present, but they don't talk about relative speed at each word size.
Here's an experiment of mapping a C. elegans transcript against the C.
briggsae genome. Data is from the BLAST book by myself, Mark Yandell,
and Joey Bedell from O'Reilly & Associates (shameless plug for the book
coming out in about a month).
 W    relative speed
---   --------------
  8   1.5
  9   5.3
 10   8.5
 11   1.0
 15   1.0
 20   1.4
 30   2.3
 40   2.8
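
For anyone who wants to poke at the figures, here is a minimal Python sketch that loads the table above and reports where the gain peaks. It is only an illustration: the function names are made up for this example, and the numbers are simply the relative speeds as listed, not new measurements.

```python
# Relative speed figures copied from the table above (word size W -> relative speed).
relative_speed = {8: 1.5, 9: 5.3, 10: 8.5, 11: 1.0, 15: 1.0, 20: 1.4, 30: 2.3, 40: 2.8}


def best_word_size(speeds):
    """Return the (W, relative speed) pair with the largest relative speed."""
    return max(speeds.items(), key=lambda item: item[1])


def word_sizes_at_least(speeds, threshold):
    """Return the word sizes whose relative speed meets the threshold, sorted."""
    return sorted(w for w, s in speeds.items() if s >= threshold)


if __name__ == "__main__":
    w, s = best_word_size(relative_speed)
    print(f"largest gain: W={w} ({s}x)")
    print("W with >= 2x gain:", word_sizes_at_least(relative_speed, 2.0))
```

The largest gain is at W=10, and the word sizes with at least a twofold gain fall into two separate groups (9-10 and 30-40), with no gain at the default of 11.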
-Ian