[Bioclusters] Nightly updated BLAST databases

Tue, 17 Dec 2002 09:40:14 +0800 (SGT)

> Generally this is not so hard.  You can even incorporate the update
> into a queuing system, as long as you use an O(1) data distribution
> system, such as the old ccp I had architected, or some newer stuff.  
> Use a priority based mechanism to schedule the update to occur between
> computing runs.  This requires some tuning/tweaking of the queuing
> system, but it is generally not that hard to do.

That was quite enlightening to a non-systems person like me. One thing I
would still say, though, is that I personally wouldn't like to implement an
automatic update, mainly because one wants to be able to reproduce a whole
bioinformatics protocol (especially if it is for publication), and to do
that one has to know which version of a database each process was running
against.

Obviously, if all we are talking about is BLAST dbs for a web BLAST
interface, then it's fine. It matters more if you are doing large-scale
runs to derive results based on things like best reciprocal hit, or even
just best hit, which can change with the version of the database you are
running against...

Nonetheless, very, very interesting, thanks! We might take it all the way
by automating as you suggest and then storing the database version
information in our pipeline, to be able to track it...
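
To make the tracking idea concrete, here is a rough sketch of what it
could look like. It assumes that all files of a formatted database share a
common prefix (for example nr.phr, nr.pin, nr.psq) and that a checksum
plus the newest modification time is enough to identify a snapshot. The
function names, the log format and the paths are hypothetical, not part of
any existing pipeline.

```python
import hashlib
import json
import os
import time


def fingerprint_db(db_dir, db_name):
    """Describe the files that make up one BLAST database snapshot."""
    # Assumption: every file of the database starts with "<db_name>.".
    files = sorted(f for f in os.listdir(db_dir) if f.startswith(db_name + "."))
    if not files:
        raise FileNotFoundError("no files for database %r in %r" % (db_name, db_dir))
    digest = hashlib.sha256()
    latest_mtime = 0.0
    for fname in files:
        path = os.path.join(db_dir, fname)
        # Hash the file name too, so renamed files change the fingerprint.
        digest.update(fname.encode("utf-8"))
        with open(path, "rb") as handle:
            # Read in 1 MiB chunks so large volumes do not fill memory.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        latest_mtime = max(latest_mtime, os.path.getmtime(path))
    return {
        "database": db_name,
        "files": files,
        "sha256": digest.hexdigest(),
        "last_modified": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(latest_mtime)),
    }


def record_run(log_path, run_id, db_record):
    """Append one JSON line linking a pipeline run to the snapshot it used."""
    entry = {"run_id": run_id, "db": db_record}
    with open(log_path, "a") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

Each job would call fingerprint_db() just before it starts and pass the
result to record_run(), so a result can later be traced back to the
snapshot it was computed against even if the nightly update has replaced
the files since.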

Elia

********************************
* http://www.fugu-sg.org/~elia *
* tel:    +65 6874 1467        *
* mobile: +65 9030 7613        *
* fax:    +65 6779 1117        *
********************************