As we start to use our compute farm for bigger and bigger tasks, I have come to realise that the way we currently submit our BLAST jobs is considerably sub-optimal. Obviously, one run of 100 sequences against a database is much more efficient than 100 separate runs against the same database. Has anyone developed scripts that sit inside some part of a queue submission system (in this case SGE) to make this more efficient? I'm thinking along the lines of something that monitors the size and number of queries, notes the number of available nodes, and batches the jobs up to match one against the other.
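To make the idea concrete, here is a rough Python sketch of the batching step only. It is not an existing tool, and every name in it (`read_fasta`, `batch_queries`, `write_batches`, the `batch.N.fa` file naming) is made up for illustration. It assumes the queries arrive as FASTA, that the number of available nodes is passed in by whatever does the monitoring, and that total sequence length is a reasonable stand-in for how long a batch will take to run.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def batch_queries(records: Iterable[Record], n_batches: int) -> List[List[Record]]:
    """Split records into at most n_batches groups of similar total length.

    Greedy heuristic (an assumption, not a tuned scheduler): take the
    longest sequence first and always add it to the lightest batch so far.
    """
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    batches: List[List[Record]] = [[] for _ in range(n_batches)]
    heap = [(0, i) for i in range(n_batches)]  # (total residues, batch index)
    heapq.heapify(heap)
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        total, i = heapq.heappop(heap)
        batches[i].append(rec)
        heapq.heappush(heap, (total + len(rec[1]), i))
    # Drop empty batches when there are fewer queries than nodes.
    return [b for b in batches if b]


def write_batches(batches: List[List[Record]], prefix: str = "batch") -> List[str]:
    """Write each batch to '<prefix>.<n>.fa' (n starts at 1) and return the paths."""
    paths = []
    for n, batch in enumerate(batches, start=1):
        path = f"{prefix}.{n}.fa"
        with open(path, "w") as fh:
            for header, seq in batch:
                fh.write(f">{header}\n{seq}\n")
        paths.append(path)
    return paths
```

Each resulting file would then be one BLAST run against the database. Numbering the files from 1 is deliberate: one way to hand them to SGE would be an array job (`qsub -t 1-N`), with each task picking up `batch.$SGE_TASK_ID.fa`, though how that fits a given site's queue setup is a separate question.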



