[Bioclusters] Daemonizing blast, ie running many sequences through 1 process

Chris Dwan (CCGB) bioclusters@bioinformatics.org
Fri, 7 Nov 2003 09:23:25 -0600 (CST)


> Solaris has a CacheFS filesystem that caches mostly-read NFS filesystems
> on the local disk, intended for slow connections including PPP. This would
> be ideal for your situation; however, I don't know if Linux has anything
> similar.
>
> http://docs.sun.com/db/doc/806-4073/6jd67r9jd?a=view

It may be that my experience with Solaris is out of date, or that I failed
to configure it properly, but I remember there being a limit on the
volume of data that CacheFS would accept (the cache size, as it were).
That limit was well below the size of any of the larger target sets we
deal with, so using CacheFS for data staging led to thrashing,
particularly once we started splitting up the targets to better
parallelize our searches.
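
A rough sketch of why a cache smaller than the target set thrashes so
badly. It assumes least-recently-used eviction (an assumption made for
illustration, not a documented CacheFS policy) and that every search
reads the whole target set in order. The function name and block counts
are hypothetical.

```python
from collections import OrderedDict


def lru_hit_rate(num_blocks, cache_blocks, passes):
    """Fraction of block reads served from an LRU cache when a target
    set of num_blocks is scanned sequentially `passes` times."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    total = 0
    for _ in range(passes):
        for block in range(num_blocks):
            total += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark as most recently used
            else:
                cache[block] = True
                if len(cache) > cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
    return hits / total


# Cache one block too small: every read after the first pass still misses.
print(lru_hit_rate(num_blocks=100, cache_blocks=99, passes=5))   # 0.0
# Cache large enough for the whole set: only the first pass misses.
print(lru_hit_rate(num_blocks=100, cache_blocks=100, passes=5))  # 0.8
```

Under these assumptions the hit rate does not degrade gradually: a cache
even slightly smaller than the data being scanned gives no benefit at
all on repeated sequential scans.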

I'm curious to know if this is still the case.

Of course, a truly brilliant resource scheduler would take into account
the contents of the file cache when deciding where to run a particular
job...
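
As a minimal sketch of that idea, assuming the scheduler somehow knows
which target files each node already holds locally: pick the idle node
with the most bytes of the job's input already cached. All names here
(pick_node, the node and file names) are hypothetical, and no existing
scheduler's API is implied.

```python
def pick_node(job_files, node_caches, idle_nodes):
    """Return the idle node holding the most bytes of the job's input.

    job_files:   dict mapping file name -> size in bytes
    node_caches: dict mapping node name -> set of file names cached there
    idle_nodes:  iterable of node names that are free to run the job
    Returns None when no node is idle.
    """
    candidates = sorted(idle_nodes)  # sorted so ties break predictably
    if not candidates:
        return None

    def cached_bytes(node):
        cached = node_caches.get(node, set())
        return sum(size for name, size in job_files.items() if name in cached)

    return max(candidates, key=cached_bytes)


# Hypothetical example: a search against two pieces of a split target set.
job = {"nt.00": 500, "nt.01": 500}
caches = {"node1": {"nt.00"}, "node2": {"nt.00", "nt.01"}, "node3": set()}
print(pick_node(job, caches, ["node1", "node2", "node3"]))  # node2
```

A real scheduler would have to weigh this against queue wait time and
load, and would need some way to learn each node's cache contents, which
is the hard part.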

-Chris Dwan
 University of Minnesota