[BiO BB] Linear Bioinformatics workflow?

Amir Karger akarger at CGR.Harvard.edu
Mon Oct 3 16:17:19 EDT 2005


Several people mentioned 2-D graphical workflow tools in a "Bioinformatics
workflow?" thread on bioclusters. (I'm redirecting my non-cluster-y question
here.) While still a newbie, I'm getting the impression that many
bioinformatics workflows are mostly linear, with obvious important
exceptions like conditions and loops. For example, I had a client last week
who wanted to script:

1 blast [sequence=..., program=...] > blast.out
2 get hits from blast.out > blast.hits
3 find hits with 50-70% sequence identity from blast.hits > blast.good_hits
4 download/fastacmd sequences for IDs in blast.good_hits > hits.fasta
5 clustalw hits.fasta > publishable_result (OK, not really)
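
To make the middle of that list concrete, here is a rough sketch of the
"get hits" and "50-70% identity" steps in Python. It assumes BLAST was run
with tabular output (e.g. the legacy "-m 8" option), where the second column
is the subject ID and the third is percent identity. The function names and
the sample rows are made up for illustration; they are not from the client's
actual script.

```python
def filter_hits_by_identity(lines, low=50.0, high=70.0):
    """Yield (subject_id, percent_identity) for hits within [low, high].

    Assumes tab-separated BLAST output: column 2 is the subject ID and
    column 3 is the percent identity. Blank and '#' comment lines are skipped.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a hit line in the assumed format
        identity = float(fields[2])
        if low <= identity <= high:
            yield fields[1], identity


def good_hit_ids(lines, low=50.0, high=70.0):
    """Return unique subject IDs of the filtered hits, in first-seen order."""
    seen = {}
    for subject_id, _identity in filter_hits_by_identity(lines, low, high):
        seen.setdefault(subject_id, None)
    return list(seen)


# Invented sample rows (query, subject, %identity, remaining columns omitted).
SAMPLE = [
    "# made-up example data",
    "q1\tsubjA\t98.5\t120",
    "q1\tsubjB\t65.0\t110",
    "q1\tsubjC\t50.0\t90",
    "q1\tsubjB\t55.2\t40",
    "q1\tsubjD\t42.1\t80",
]

if __name__ == "__main__":
    for subject_id in good_hit_ids(SAMPLE):
        print(subject_id)
```

The printed IDs would then be handed to whatever does the sequence download
in the next step.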

Our current model is to help people to write shell scripts, but that doesn't
work for all users. It seems like a two-dimensional workflow tool would be
overkill for the above. All I need is a tool that combines
Pise/iNquiry-style "select a bioinformatics tool, input parameters" with the
ability to save a set of commands. 
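
To show what I mean by "save a set of commands", here is a minimal sketch,
not an existing tool. A protocol is just an ordered list of steps stored as
JSON, each with an argument list containing {placeholders} and an output
file. Every command name below (run_blast, get_hits, filter_identity,
fetch_sequences, run_alignment) is a hypothetical wrapper script, and the
parameter names are assumptions for illustration only.

```python
import json
import subprocess

# A linear protocol: each step runs one command and writes one output file.
# All command names here are hypothetical wrapper scripts.
EXAMPLE_PROTOCOL = [
    {"name": "blast", "argv": ["run_blast", "{query}"], "stdout": "blast.out"},
    {"name": "get hits", "argv": ["get_hits", "blast.out"], "stdout": "blast.hits"},
    {"name": "filter",
     "argv": ["filter_identity", "{min_identity}", "{max_identity}", "blast.hits"],
     "stdout": "blast.good_hits"},
    {"name": "fetch", "argv": ["fetch_sequences", "blast.good_hits"],
     "stdout": "hits.fasta"},
    {"name": "align", "argv": ["run_alignment", "hits.fasta"],
     "stdout": "alignment.out"},
]


def save_protocol(steps, path):
    """Write the list of steps to a JSON file so it can be reused later."""
    with open(path, "w") as fh:
        json.dump(steps, fh, indent=2)


def load_protocol(path):
    """Read a previously saved list of steps."""
    with open(path) as fh:
        return json.load(fh)


def run_protocol(steps, params, dry_run=True):
    """Fill in placeholders and run the steps strictly in order.

    Returns the list of (argv, output_path) pairs. With dry_run=True
    nothing is executed, which is handy for showing users what would run.
    """
    commands = []
    for step in steps:
        argv = [arg.format(**params) for arg in step["argv"]]
        out_path = step["stdout"].format(**params)
        commands.append((argv, out_path))
        if not dry_run:
            with open(out_path, "w") as out:
                # check=True stops the protocol at the first failing step.
                subprocess.run(argv, stdout=out, check=True)
    return commands


if __name__ == "__main__":
    params = {"query": "my.fasta", "min_identity": 50, "max_identity": 70}
    for argv, out_path in run_protocol(EXAMPLE_PROTOCOL, params):
        print(" ".join(argv), ">", out_path)
```

A form-based front end would only need to collect the values for the
placeholders. Conditions and loops are deliberately left out, which is
exactly the trade-off against the 2-D tools.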

Of course, it would be much less powerful and flexible than the 2-D workflow
tools. But "protocols" (http://biopipe.org/protocols/) might be an easier
sell to computer-phobes than directed acyclic graphs. 

Is there anything out there that does this? I'd much rather steal than
build.

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University


