[Bioclusters] The Scriptome: a minimal-learning toolbox for manipulating biological data

Amir Karger akarger at CGR.Harvard.edu
Fri May 20 14:40:44 EDT 2005

Our group has started a project called the Scriptome to provide experimental
biologists with tools for exploring and manipulating biological data.  (I
mentioned it in an exploratory email to the list a couple of months ago.) We're
not targeting the complicated bioinformatics tools that EMBOSS, Bioperl,
etc. provide. Rather, we want to help bench biologists to "eyeball", filter,
format, and analyze the many large files they get from those and other
programs. While these tasks may be trivial for a programmer, not every bench
biologist has the time or inclination to become a programmer - especially
those who do computational analysis only occasionally.

http://cgr.harvard.edu/cbg/scriptome has an alpha version.  There's a
(small) set of tools up, including a "Fetch" tool that uses Bioperl. You
might also want to look at the Principles & FAQ pages where we talk about
the thinking behind the project & the particular solution we chose.

We taught a debut 3-hour session to two groups of five biologists in the
last few weeks, showing them how to find and use the tools. Newbies were
excited that they could filter files in five minutes instead of spending
five hours hand-editing them. People who knew a bit of Perl appreciated
that the tools
were not "black box"; they could tweak the tools to be more useful, while
learning more Perl on the side. Of course, it remains to be seen how many
people use The Scriptome after the class is over. 

We're currently using an extremely lightweight interface to allow quick tool
development, and to free occasional users from needing to learn & remember
yet another "intuitive" GUI. We may want to change that (e.g., to a lite
version of Catherine Letondal's Pise,
http://www.pasteur.fr/recherche/unites/sis/Pise/, or perhaps SeWeR), but any
change
in interface requires wrestling with some frustrating pluses & minuses. For
example, the current interface requires no install whatsoever for almost all
the tools, just Perl on Unix.

This project struggles with big questions: teaching programming skills to
non-programmers, optimizing the human-computer interface, and understanding
biologists' (changing) needs. But even if we can't solve all of these
problems, I believe this project can help to address an unmet need for a
large group of scientists.

We invite submissions of new tools (optionally including code) or
"protocols" (series of tools) to be added to the site. I'd also be happy to
get advice and feedback about the project in general.  

Thanks for listening,

- Amir Karger
Computational Biology Group
Bauer Center for Genomics Research
Harvard University
