[Bioclusters] Versions of Blast that run on a cluster?
Bernard Li
bli at bcgsc.ca
Wed Jan 5 13:39:56 EST 2005
Hi Malay:
Is there any documentation, or are there any papers, that describe such a setup?
I would assume that there would be general interest in seeing how such a
setup could be implemented.
I was thinking, instead of duplicating ALL the available databases to
the local HD, could some file-staging utility be used to stage only the
database to be BLASTed against? Obviously the file-staging utility has
to work really quickly on the cluster for this method to be viable.
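
To make the staging idea concrete, here is a rough sketch in Python, not a tested production tool. It assumes the formatted database is a set of files sharing one base name on shared storage, and that each node has a local scratch directory; the function name and the directory arguments are made up for illustration. It copies only files that are missing or out of date locally, so repeated jobs against the same database do not pay the transfer cost again.

```python
import os
import shutil
from pathlib import Path


def stage_database(db_name, shared_dir, local_dir):
    """Copy the files of one database from shared storage to local disk.

    Only files that are missing locally, differ in size, or are older
    than the shared copy are transferred. Returns the names of the
    files that were actually copied.
    """
    shared_dir, local_dir = Path(shared_dir), Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # Assumed layout: all files of a database share the base name,
    # e.g. nt.nhr, nt.nin, nt.nsq.
    for src in sorted(shared_dir.glob(db_name + ".*")):
        if not src.is_file():
            continue
        dst = local_dir / src.name
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if s.st_size == d.st_size and int(s.st_mtime) <= int(d.st_mtime):
                continue  # local copy is already current
        tmp = dst.with_name(dst.name + ".part")
        shutil.copy2(src, tmp)  # copy2 also preserves the modification time
        os.replace(tmp, dst)    # rename into place so no job sees a partial file
        copied.append(dst.name)
    return copied
```

A job script would call this once before running the search and then point BLAST at the local directory. Whether this is fast enough depends entirely on the shared filesystem and the network, which is the open question above.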
Thanks,
Bernard
> -----Original Message-----
> From: bioclusters-bounces at bioinformatics.org
> [mailto:bioclusters-bounces at bioinformatics.org] On Behalf Of Malay
> Sent: Wednesday, January 05, 2005 10:23
> To: Clustering, compute farming & distributed computing in
> life science informatics
> Subject: Re: [Bioclusters] Versions of Blast that run on a cluster?
>
> Bernard Li wrote:
> > Hi Malay:
> >
> >
> >>Oops I forgot to mention the third option. This is for production
> >>machines for very high-end scaling up and requires an ample amount of
> >>disc space in each node. This is to have each node keep its own local
> >>copy of the database, and use input spitting through SGE. This is the
> >>best way to scale up to ~1000 jobs at a time. But because of the
> >>database maintenance issue, this method is advisable only for a
> >>dedicated BLAST farm.
> >
> >
> > You meant 'input splitting' right? And how would you accomplish that
> > using SGE? By scripting it in your job script?
> >
>
> I meant submit each sequence as a separate job.
>
> There is one more way of doing it, which is called the "pull technique".
> You store each sequence in an RDBMS. A daemon runs on each node,
> pulls a sequence from the RDBMS, runs it against its own local
> BLAST database, stores the result in an accessible place, and marks the
> job in the RDBMS as "done". A designated node then queries the RDBMS for
> jobs marked done and pulls the results from that place. This method is
> the most efficient of them all, and is used in the BLAST server at NCBI.
>
>
> -Malay
>
> _______________________________________________
> Bioclusters maillist - Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
>
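
For anyone trying to picture the "pull technique" quoted above, the following is a minimal sketch in Python, not a description of how NCBI or anyone else actually implements it. SQLite stands in for the RDBMS so the example is self-contained; the table layout, the status values, and the `run_search` callable (which would wrap the real BLAST invocation on a node) are all assumptions made for illustration. The input is split into one job per sequence, workers claim pending jobs, and a collector picks up the finished ones.

```python
import sqlite3


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def init_queue(conn):
    """Create the (hypothetical) job table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY, header TEXT, sequence TEXT,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " node TEXT, result_path TEXT)"
    )
    conn.commit()


def enqueue(conn, fasta_text):
    """Input splitting: store every sequence as its own pending job."""
    rows = list(read_fasta(fasta_text))
    conn.executemany("INSERT INTO jobs (header, sequence) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def claim_job(conn, node):
    """Claim one pending job for this node; return (id, header, sequence) or None."""
    while True:
        row = conn.execute(
            "SELECT id, header, sequence FROM jobs"
            " WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in WHERE means only one worker can win a given job.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running', node = ?"
            " WHERE id = ? AND status = 'pending'",
            (node, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row


def worker_loop(conn, node, run_search):
    """Pull and process jobs until none are pending; return how many were run.

    run_search is a hypothetical callable (header, sequence) -> result path.
    """
    processed = 0
    while True:
        job = claim_job(conn, node)
        if job is None:
            return processed
        job_id, header, sequence = job
        result_path = run_search(header, sequence)
        conn.execute(
            "UPDATE jobs SET status = 'done', result_path = ? WHERE id = ?",
            (result_path, job_id),
        )
        conn.commit()
        processed += 1


def collect_done(conn):
    """Collector node: return (id, result_path) of finished jobs and mark them collected."""
    rows = conn.execute(
        "SELECT id, result_path FROM jobs WHERE status = 'done' ORDER BY id"
    ).fetchall()
    conn.executemany(
        "UPDATE jobs SET status = 'collected' WHERE id = ?", [(r[0],) for r in rows]
    )
    conn.commit()
    return rows
```

A real daemon would sleep and poll again instead of returning when the queue is empty, and a multi-node setup would need a networked database rather than SQLite. The point of the design is that nodes ask for work when they are free, so faster nodes naturally take more jobs.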