[Bioclusters] Clusters for bioinformatics... Some numbers

Jeremy Mann jeremy@bioc09.v19.uthscsa.edu
Thu, 30 Aug 2001 12:40:40 -0500 (CDT)


> Hi Jeremy,
> 
> could you please send more details about those Condor standard universe
> jobs?  Thanks!

What it actually does, or how to install Condor? You basically have two
universes to choose from in Condor, Standard and Vanilla. Condor comes with a
special linker wrapper, condor_compile, that relinks your source packages
against the Condor libraries. That's right, you can download any source
package, any program, and instead of make; make install, you would do
condor_compile make; make install. This relinks the executable so it can run
in the Standard Condor universe. Here's a sample ripped from the Condor users
manual:

    Condor has four runtime environments (called a universe) from which to
    choose. Of the four, two are likely choices when learning to submit a
    job to Condor: the standard universe and the vanilla universe. The
    standard universe allows a job running under Condor to handle system
    calls by returning them to the machine where the job was submitted.
    The standard universe also provides the mechanisms necessary to take a
    checkpoint and migrate a partially completed job, should the machine
    on which the job is executing become unavailable. To use the standard
    universe, it is necessary to relink the program with the Condor
    library using the condor_compile command. The condor_submit manual
    page has details.

    The vanilla universe provides a way to run jobs that cannot be
    relinked. It depends on a shared file system for access to input and
    output files, and there is no way to take a checkpoint or migrate a
    job executed under the vanilla universe.
Take a look at the users manual at http://www.cs.wisc.edu/condor/manual/v6.2/
for more information.
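Just to give a feel for it, relinking a small C program or a typical source
package for the standard universe looks roughly like this (the program and
file names here are only placeholders, not anything we actually run):

    # relink a single program against the Condor libraries
    % condor_compile gcc -o myprog myprog.c

    # or wrap the build of a normal source package
    % ./configure
    % condor_compile make
    % make install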

Right now we only use one application on the Condor cluster. It's a Java app,
so I can't relink it against the Condor libraries to make it work in the
standard universe. It runs in vanilla mode instead, and this is how it
operates:

1. The application outputs the separate job files (1-40).
2. The user submits all 40 jobs to the central manager (a sample submit file
is sketched below).
3. Condor rcp's the binary executable (something.exe) and the appropriate
job file to each node.
4. Condor starts the executable with the job file. When it finishes, Condor
rcp's the finished data back into the user's home directory, or wherever
they submitted the job(s) from.
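A vanilla universe submit description file for that kind of run looks roughly
like this (the job file names and the queue count are just an illustration of
the 40-job setup above, not our real file):

    universe   = vanilla
    executable = something.exe
    arguments  = job$(Process).dat
    output     = job$(Process).out
    error      = job$(Process).err
    log        = jobs.log
    queue 40

The $(Process) macro runs from 0 to 39, so each queued job picks up its own
job file, and the user just runs condor_submit on that file from wherever
they want the results to land.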

The 'cluster' we use is a Beowulf made up of 20 dual P3-500 nodes with 256
megs of RAM and a 10 gig drive each. /home and /usr/local are NFS mounted,
which lets us install Condor once in /usr/local/condor and gives every node
access to the condor user's home directory in /home/condor.
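On each node those shared directories are just ordinary NFS mounts, something
along these lines ('fileserver' is a placeholder for whatever host exports
them):

    fileserver:/home       /home       nfs  defaults  0 0
    fileserver:/usr/local  /usr/local   nfs  defaults  0 0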

Sorry if I am carrying on, but installing and running Condor has been one of
my big highlights. All that is left is to deploy it on a larger scale and
offer more choices for each user.



-- 
Jeremy Mann
jeremy@biochem.uthscsa.edu