[Bioclusters] BLAST job time estimates
Tim Cutts
bioclusters@bioinformatics.org
Tue, 1 Jun 2004 14:00:07 +0100
On 1 Jun 2004, at 1:33 pm, Micha Bayer wrote:
> Hi Tim,
>
> thanks for that. Can you just clarify what n and m are in your response
> below?
For a given pair of sequences being aligned, n and m are the lengths of
the two sequences. So for your BLAST search, you need to know the length
of the longest query sequence and the length of the longest target
sequence.
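If it helps, here is a minimal sketch of how you might pull those two
numbers out of FASTA-formatted input. It assumes plain FASTA text; the
function name and the toy sequences are made up for illustration and are
not part of BLAST.

```python
def longest_sequence_length(fasta_text):
    """Return the length of the longest sequence in FASTA-formatted text."""
    longest = 0
    current = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # Header line: the previous record (if any) is complete.
            longest = max(longest, current)
            current = 0
        else:
            # Sequence data may be wrapped over several lines.
            current += len(line)
    # Account for the final record, which has no header after it.
    return max(longest, current)


# Toy data, purely illustrative.
queries = ">q1\nACGTACGT\nACG\n>q2\nACGT\n"
targets = ">t1\nACGTACGTACGTACGT\n>t2\nAC\n"

n = longest_sequence_length(queries)  # longest query sequence
m = longest_sequence_length(targets)  # longest target sequence
print(n, m)
```

For a real search you would read the query file and the target database
dump from disk rather than from strings.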
> It looks like I'm stuck with doing the time prediction because we are
> plugging into an existing cluster with existing rules, much as I would
> like to avoid this issue altogether.... :-)
All I can suggest then is an iterative procedure - submit jobs with a
very conservative (i.e. generous) estimate of CPU time. They'll get low
priority, but that's better than them being killed because they've been
running too long. Then reduce the requested CPU time once you've got a
feel for what the jobs really need.
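One way to picture that procedure is the sketch below. It is only an
illustration under my own assumptions: the function name, the safety
factor of 2 and the 24 hour starting request are all hypothetical, and
how you actually pass the limit to your scheduler is not shown.

```python
def next_cpu_limit(observed_seconds, current_limit, safety_factor=2.0):
    """Suggest the CPU-time request for the next batch of jobs.

    observed_seconds: CPU times of jobs that finished under current_limit.
    safety_factor: assumed headroom over the slowest job seen so far.
    """
    if not observed_seconds:
        # Nothing has finished yet, so there is no evidence to go on.
        return current_limit
    proposed = max(observed_seconds) * safety_factor
    # Only ever tighten the request; never go above the current one.
    return min(current_limit, proposed)


# Hypothetical starting point: a deliberately generous 24 hour request.
limit = 24 * 3600
for batch in ([1800, 2400, 2100], [2000, 2600], [2500, 2300]):
    limit = next_cpu_limit(batch, limit)
    print(limit)
```

The point of keeping some headroom is that an estimate which is too
tight gets jobs killed, which costs more than a bit of lost priority.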
Tim
PS. I wish the powers that be would let me be as draconian with our
cluster as your guys are. It would solve a whole heap of trouble. :-)
--
Dr Tim Cutts
Informatics Systems Group
Wellcome Trust Sanger Institute
Hinxton, Cambridge, CB10 1SA, UK