[Bioclusters] Re: Help on BLAST

Marc Rieffel bioclusters@bioinformatics.org
Wed, 28 Aug 2002 14:36:43 -0700


> Chris Dwan (CCGB) wrote:
>
>Great answer, Joe.  
>
>I have observed similarly "weird" behavior in trying to build up a
>comprehensive picture of alignments from supposedly common sequence
>fragments at varying lengths: 
>

Paracel BLAST (PB) automatically performs the process that you are 
describing.  Depending on the size of the query and database, it 
segments the query into sections, searches those sections in parallel, 
and reassembles the results.  There are three important differences, 
though, between the way PB does it and how you might do it with a 
"wrapper script".  

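For comparison, the splitting step of a hand-rolled wrapper script might 
look roughly like the Python sketch below.  This is not Paracel's 
implementation.  The function names, the fixed chunk length, and the 
overlap between neighbouring sub-queries are all assumptions made for 
illustration.

```python
def split_query(seq, chunk_len, overlap):
    """Cut seq into overlapping windows; return (offset, subsequence) pairs.

    chunk_len and overlap are illustrative parameters, not PB settings.
    """
    if chunk_len <= overlap:
        raise ValueError("chunk_len must exceed overlap")
    step = chunk_len - overlap
    chunks = []
    start = 0
    while True:
        chunks.append((start, seq[start:start + chunk_len]))
        if start + chunk_len >= len(seq):
            break  # this window already reaches the end of the query
        start += step
    return chunks


def to_query_coords(offset, hit_start, hit_end):
    """Map hit coordinates within a sub-query back onto the full query."""
    return offset + hit_start, offset + hit_end
```

Each sub-query keeps its offset so that hit coordinates can be mapped 
back to the original query when the reports are merged.
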
First, PB uses (modified) NCBI source code to perform the sorting 
and statistics calculations during the merge, so you can be confident in 
the results.  A wrapper script that re-parses text reports is limited by 
what those reports print; PB is not affected, for example, by the number 
of digits displayed in an output report.  
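
To illustrate the digits issue, here is a small Python sketch.  It 
assumes a hypothetical report that prints E-values with one significant 
digit; the hit names and values are made up, and real report formats may 
differ.

```python
def printed_evalue(evalue):
    """Format an E-value with one significant digit (assumed report style)."""
    return f"{evalue:.0e}"


def rank_by_report(hits):
    """Sort hits using only the value recovered from the printed report."""
    return sorted(hits, key=lambda h: float(printed_evalue(h[1])))


def rank_by_value(hits):
    """Sort hits using the full-precision E-value."""
    return sorted(hits, key=lambda h: h[1])


# Made-up hits: (name, E-value).  Both print identically in the report.
hits = [("hitA", 1.4e-30), ("hitB", 1.1e-30)]
report_order = [name for name, _ in rank_by_report(hits)]
true_order = [name for name, _ in rank_by_value(hits)]
```

Under these assumptions the two hits tie once rounded, so a script that 
sorts on the printed values leaves them in input order, while sorting on 
the underlying values puts hitB first.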

Second, PB detects hits that may span the gaps between sub-queries, and 
uses (modified) NCBI code to recompute the complete alignments.  This 
way, you avoid repeated and incomplete hits.
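
A wrapper script could approximate the detection step along the lines 
of the Python sketch below.  It only merges hit intervals, already 
mapped to full-query coordinates, that overlap or nearly touch for the 
same subject.  It does not recompute alignments or statistics the way PB 
is described as doing, and it ignores subject coordinates and strand.  
The tuple layout and the max_gap parameter are assumptions for 
illustration.

```python
from collections import defaultdict


def merge_spanning_hits(hits, max_gap=0):
    """Merge (subject, q_start, q_end) hits that overlap on the query.

    Hits for the same subject whose query intervals overlap, or lie
    within max_gap of each other, are collapsed into one interval.
    """
    by_subject = defaultdict(list)
    for subject, q_start, q_end in hits:
        by_subject[subject].append((q_start, q_end))

    merged = []
    for subject in sorted(by_subject):
        intervals = sorted(by_subject[subject])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + max_gap:
                # Same region seen from a neighbouring sub-query: extend it.
                cur_end = max(cur_end, end)
            else:
                merged.append((subject, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((subject, cur_start, cur_end))
    return merged
```

For example, hits at query positions 0-100 and 90-180 against the same 
subject would be reported once as 0-180 rather than as two partial hits.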

Third, PB runs the sub-queries in parallel on a cluster system, allowing 
you to rapidly complete large BLAST analyses.

Paracel BLAST offers a complete turnkey solution for high-throughput 
parallel BLAST searches.  If you'd rather spend your time analyzing 
results than writing code and hacking Perl scripts, it may be 
appropriate for you.

Marc Rieffel
Paracel
marc@paracel.com
626-744-2080