[Bioclusters] SGE array job produces too many files and concatenation is slow

Shane Brubaker brubaker2 at llnl.gov
Mon Jan 30 13:08:48 EST 2006


Hi, my name is Shane Brubaker and I work at the Joint Genome Institute.

We are facing a scalability problem with large numbers of short jobs, 
using SGE and a workflow system that we wrote.

We are running large numbers (10,000 to 100,000) of jobs that are very 
short (1 second).  Admittedly, one second is too short for a job and 
will produce a lot of overhead no matter what, but there are times when 
it is difficult to change our code to produce longer jobs, and we'd 
like to provide some facility for doing this with as little overhead as 
possible.

Also, when our file systems have more than a few thousand files in one 
directory, things slow down tremendously and it becomes impossible even 
to ls the directory.  It can also crash our file servers.  We are using NFS.

I have come up with a strategy of using an array job and having the 
workflow system, which is written in Perl, concatenate the smaller task 
files onto the end of a set of master logs and then remove the smaller 
files, using system calls, as I go.  This actually worked quite well 
for 10,000 jobs, keeping the directory from growing and greatly 
improving performance.
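
For concreteness, here is a stripped-down sketch of the kind of 
append-and-remove loop I mean.  The real code does the concatenation 
and removal with system calls rather than plain Perl I/O, but the idea 
is the same; the file names below (run.o<jobid>.<taskid> and 
master.<jobid>.log) are just placeholders, and it assumes a task's 
output file is complete by the time the loop reaches it:

#!/usr/bin/env perl
# Sketch only: append each per-task output file onto a master log,
# then remove the small per-task file.  The file name patterns are
# placeholders, not our real layout.
use strict;
use warnings;

my $job_id     = shift or die "usage: $0 <sge_job_id>\n";
my $master_log = "master.$job_id.log";

open my $master, '>>', $master_log
    or die "cannot append to $master_log: $!";

# Per-task stdout files, e.g. run.o<jobid>.42 for task 42 of the array job.
for my $task_file (glob "run.o$job_id.*") {
    open my $in, '<', $task_file or next;   # unreadable: skip, retry on a later pass
    print {$master} "==== $task_file ====\n";
    print {$master} $_ while <$in>;
    close $in;
    unlink $task_file or warn "could not remove $task_file: $!";
}
close $master;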

However, when I went to 100,000 jobs the files were created faster than 
they could be concatenated, and the system is now slowly working 
through that huge directory and appending the smaller files, even 
though the array job has long since finished.

I am wondering if anyone has experience with this and has a recommended 
solution.  I am also curious whether the SGE folks have any plans to 
add a master log capability for array jobs.  Finally, I would really 
appreciate any general advice on fast ways to append files and on 
dealing with large directories.


Thanks,
Shane Brubaker


