[Bioclusters] Parallel blast

Wim Glassee bioclusters@bioinformatics.org
Fri, 7 Jun 2002 14:43:21 +0200


> -----Original Message-----
> From: bioclusters-admin@bioinformatics.org
> [mailto:bioclusters-admin@bioinformatics.org] On Behalf Of Chris Dwan (CCGB)
> Sent: Friday, 7 June 2002 14:14
> To: bioclusters@bioinformatics.org
> Subject: Re: [Bioclusters] Parallel blast
> 
> 
> > Is there or is there not a parallel version of blast available
> > somewhere?
> 
> (addressing a different part of the question than Chris D.)
> 
> I'm not aware of an MPI version of BLAST, nor would I use one if it were
> available.  My problem is throughput, not response time on BLAST jobs.
> In this situation, anything less than a parallel efficiency of one is
> wasting resources.
> 
> NCBI's BLAST has the "-a <NUM_CPUS>" option, which enables threading.
> If your operating system is intelligent about SMP and you have more than
> one CPU on the board, you can use it to run in parallel.  I haven't
> studied it with any rigor, but thumbnail tests indicate that the parallel
> efficiency is fairly high for NUM_CPUS <= 8.

Yep, that's what I hear.

> 
> This is great for decreasing wait time for web users, but it doesn't
> address my interests at all.  Not to beat it into the ground, but:
> It's throughput that we need.  That's why we're all so fond of queuing
> systems and processing farms, rather than high performance parallel
> machines.

My thoughts exactly. A coarse-grained parallel solution would therefore
be best. Cut the sequence into pieces, chop the database into pieces,
run them all against each other and merge the output. And as far as I
know this hasn't been done yet.
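
To make the idea concrete, here is a rough sketch of the bookkeeping in
Python. It is only an illustration, not an existing tool: the function
names (read_fasta, split_chunks, make_jobs, merge_hits) are made up, the
hit format (query id, subject id, score) is an assumption, and the actual
BLAST run for each job is left out. Each (query piece, database piece)
pair would be one independent job for the queuing system.

```python
from itertools import product


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, seq = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq)))
    return records


def split_chunks(records, n_chunks):
    """Deal records round-robin into at most n_chunks non-empty pieces."""
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    return [c for c in chunks if c]


def make_jobs(query_chunks, db_chunks):
    """Every query piece against every database piece: one job per pair."""
    return list(product(range(len(query_chunks)), range(len(db_chunks))))


def merge_hits(per_job_hits):
    """Combine per-job hit lists into one ranked list per query.

    per_job_hits is a list (one entry per job) of lists of
    (query_id, subject_id, score) tuples; this layout is hypothetical.
    Hits are ranked by raw score, highest first.
    """
    merged = {}
    for hits in per_job_hits:
        for query_id, subject_id, score in hits:
            merged.setdefault(query_id, []).append((subject_id, score))
    for query_id in merged:
        merged[query_id].sort(key=lambda hit: hit[1], reverse=True)
    return merged
```

The merge step ranks by score rather than E-value on purpose: an E-value
depends on the size of the database searched, so values computed against
separate database pieces are not directly comparable unless every job is
told to use the size of the full database.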

Wim