[BiO BB] Open science via bioinformatics competitions

Anthony Goldbloom anthony.goldbloom at kaggle.com
Mon May 17 22:24:19 EDT 2010

For three weeks Kaggle, a platform for data prediction competitions, has
been running a bioinformatics competition
(http://kaggle.com/hivprogression). The competition requires competitors
to pick markers in HIV's genetic sequence that predict a change in viral

The results have been better than we could have hoped for. Within a week
and a half, the best submission had already outdone the best methods in
the literature. (This is particularly surprising when you consider that
the prize is just $500 and the opportunity to co-author a paper with the
competition host.) There's a short post about the results so far on the
Kaggle blog.

This early result suggests that Kaggle has hit on a great way to do open
science. A contributing factor in the success of the Predict HIV
Progression competition is the degree of cooperation on the
competition's forum. Moreover, by hosting this competition, William has
opened up his dataset to other scientists, giving them access to a
problem they wouldn't otherwise know about. 

Kaggle doesn't control the results - their fate is up to the competition
host. William plans to open-source the winning method from the
Predict HIV Progression competition.

We want to repeat this feat, so we are looking for others to open up
their problems. If you are interested, please get in touch
(anthony.goldbloom at kaggle.com). 
