On Fri, 7 Nov 2003, Chris Dwan (CCGB) wrote:

> It may be that my experience with Solaris is out of date, or that I failed
> to properly parameterize it, but I remember there being a limit on the
> volume of data that CacheFS would accept (the cache size, as it were).
> That limit was well below the size of any of the larger target sets we
> deal with, so using cachefs as a solution to data staging led to
> thrashing, particularly when we started splitting up the targets to better
> parallelize our searches.

In theory, CacheFS can use up to 90% of the filesystem it is configured on as its cache. That is only theoretical for me, though, since I've never used CacheFS in a Blast environment; my nodes run Linux. Come to think of it, running CacheFS doesn't make much sense if you have that much disk space to spare; you might as well store the databases locally.

> Of course, a truly brilliant resource scheduler would take into
> account the contents of the file cache when deciding where to run a
> particular job...

We've been playing around with splitting up the load by storing certain databases locally and others over NFS. Scheduling is done by specifying certain resource flags in SGE. This is much easier if you have control over how people submit jobs (e.g. through a web page), but not so easy with command-line submissions.
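For what it's worth, the 90% figure is CacheFS's `maxblocks` setting. Going from the Solaris man pages (I haven't run this myself; the paths and server name are placeholders), setting up a cache looks roughly like:

```shell
# Create a cache directory, allowing it to grow to at most 90% of the
# front filesystem's blocks (90 is also the documented default).
cfsadmin -c -o maxblocks=90 /local/cache

# Mount the NFS-exported Blast databases through that cache.
mount -F cachefs -o backfstype=nfs,cachedir=/local/cache \
    dbserver:/export/blastdb /blastdb
```

If the target sets really are larger than 90% of the front filesystem, you'd still see the thrashing Chris describes, since the cache evicts to stay under `maxblocks`.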
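To sketch what I mean by resource flags (the complex name `local_db` is just an example, not something SGE ships with): you define a requestable boolean complex, attach it to the hosts that hold local database copies, and have jobs request it with `-l`.

```shell
# Add a requestable boolean complex via `qconf -mc`, e.g. a line like:
#   local_db  ldb  BOOL  ==  YES  NO  0  0

# Attach it to the hosts holding local databases:
qconf -mattr exechost complex_values local_db=true node01

# Jobs needing a local copy request the flag explicitly:
qsub -l local_db=true run_blast.sh

# Jobs that should read over NFS simply omit it.
```

With a web front end you can inject the `-l` flag automatically per database; command-line users have to remember it themselves, which is where it gets messy.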