On 6 Jan 2005, at 5:49 pm, Malay wrote:

> Rayson Ho wrote:
>> Gridengine currently has the "AND" operator job dependency:
>>
>>     A,B -> C
>>
>> i.e. we need to wait for jobs A and B to finish before we start job C.
>>
>> There are discussions on the SGE dev mailing list about adding the OR
>> job dependency:
>>
>>     A|B -> C
>>
>> So job C will start as soon as job A or job B finishes.
>>
>> I am wondering if this is useful in bioinformatics job flows??
>
> As far as bioinformatics goes I am afraid most of the bioinformatics
> applications are embarrassingly independent :) Although such dependency
> resolution issues will have their niche applications, I guess they are
> very limited as far as bioinformatics goes.

I don't think that's true. When you consider something like a gene
annotation process, there are lots of dependencies.

Consider what goes on with Ensembl. Before any analyses are performed,
the sequences have to be dusted and RepeatMasked. After that, raw
features such as BLAST hits, ab initio gene predictions and EST
alignments can be calculated. Once the BLAST hits have been done,
genewise alignments can be performed (using the BLAST results to narrow
down the areas genewise needs to analyse). Only once the EST alignments,
ab initio predictors and genewise are complete can the code be run to
combine these into a coherent set of gene structures.

Although each of these processes consists of thousands of independent
jobs, each type of analysis is dependent on the completion of the
previous ones.

As it happens, all of these dependencies are handled in the Ensembl
RuleManager rather than by the scheduling system. They're all AND
dependencies as far as I can tell, and I've never needed anything other
than AND dependencies in my own pipelines, but I wouldn't like to claim
that OR dependencies aren't useful to someone.

Tim

-- 
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233
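
For anyone who wants to play with the difference between AND and OR dependencies, here is a rough Python sketch of the annotation stages described above. It is only an illustration: the stage names, the DEPS table and the ready() and waves() helpers are all made up for this example, and none of it reflects how the Ensembl RuleManager or Gridengine actually implements dependencies. I have also assumed that dusting and RepeatMasking have no prerequisites of their own.

```python
# Hypothetical stage names for illustration only; this is not the real
# Ensembl RuleManager configuration.
DEPS = {
    "dust": set(),
    "repeatmask": set(),
    "blast": {"dust", "repeatmask"},
    "ab_initio": {"dust", "repeatmask"},
    "est_align": {"dust", "repeatmask"},
    "genewise": {"blast"},
    "gene_build": {"est_align", "ab_initio", "genewise"},
}


def ready(deps, done, mode="and"):
    """Return the unfinished stages that may start, given the finished ones.

    mode="and": every prerequisite must be finished (A,B -> C).
    mode="or":  any one finished prerequisite is enough (A|B -> C).
    """
    check = all if mode == "and" else any
    out = []
    for stage, prereqs in deps.items():
        if stage in done:
            continue
        # A stage with no prerequisites can always start.
        if not prereqs or check(p in done for p in prereqs):
            out.append(stage)
    return sorted(out)


def waves(deps, mode="and"):
    """Group stages into successive waves.

    Simplifying assumption: every stage in a wave finishes before the
    next wave is worked out.
    """
    done, result = set(), []
    while len(done) < len(deps):
        batch = ready(deps, done, mode)
        if not batch:
            raise ValueError("dependency cycle detected")
        result.append(batch)
        done.update(batch)
    return result


if __name__ == "__main__":
    for number, batch in enumerate(waves(DEPS), start=1):
        print(number, ", ".join(batch))
    # With only "dust" finished, OR semantics would release more stages.
    print(ready(DEPS, {"dust"}, mode="and"))
    print(ready(DEPS, {"dust"}, mode="or"))
```

With AND semantics the final gene build only becomes ready after genewise, which itself waits on BLAST, so the stages fall into four waves. Switching to mode="or" lets the raw-feature stages start as soon as either masking step is done, which would be wrong for this pipeline and is one reason AND is the natural fit here.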