[Bioclusters] SGE/MPI on OS X

Chris Dagdigian bioclusters@bioinformatics.org
Tue, 20 Jul 2004 16:53:41 -0400


{ My $.02 }

SGE comes with preconfigured "example" templates for both MPICH and PVM
integration. Take a look in $SGE_ROOT/examples/mpi/ for the MPI files.

My experience with parallel environments (PEs) within Grid Engine is
that they are largely application specific: most of the time you need to
configure a separate PE for each app you hope to run on the cluster. The
reason is that each app tends to need its own start/stop/cleanup
commands, and those rarely generalize well.

Your best bet initially is to take things in phases:

First: Get your MPI application running on the cluster outside of Grid
Engine.

Next: Go for "loose integration" of a parallel environment with SGE.

With "loose" integration, all SGE is responsible for is finding the
correct number of hosts and job slots and then generating a custom MPI
hostfile that your app must "honor" when it runs.
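
To make that concrete, here is a rough sketch of what a loose-integration
PE definition might look like, written out by a small shell function. The
function name write_loose_pe and the slot count of 64 are made up for
illustration, and the exact set of fields differs between SGE versions,
so compare against what "qconf -sp <pe_name>" prints on your own cluster
before loading anything.

```shell
# Write a loose-integration PE definition to a file.
# $1 = PE name, $2 = output file. Both the helper name and the values
# below are illustrative assumptions, not a shipped SGE template.
write_loose_pe() {
    cat > "$2" <<EOF
pe_name            $1
slots              64
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    \$fill_up
control_slaves     FALSE
job_is_first_task  TRUE
EOF
}
```

Something like "write_loose_pe myParallelEnvironment /tmp/my.pe" followed
by "qconf -Ap /tmp/my.pe" should then register it. control_slaves FALSE is
what makes this "loose": SGE only allocates slots and does not start or
track the remote processes itself.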

The nice thing about loose integration is that it is easy to set up.
SGE may output the hostfile in a format that your app can use right
away, or you may have to take the simple extra step of writing a start
script for your PE (the start_proc_args setting) that "translates" the
hostfile into a machinefile format your application recognizes.
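
As a sketch of that translation step: the function below assumes each
line of the SGE hostfile has the form "host slots queue processor-range"
and that your launcher wants an MPICH-style machinefile with one hostname
per line, repeated once per slot. The function name is hypothetical, and
you should check both formats against your own SGE and MPI versions.

```shell
# Convert an SGE PE hostfile ($1) into a one-host-per-slot machinefile
# on stdout. Assumes column 1 is the hostname and column 2 the slot count.
pe_hostfile_to_machinefile() {
    awk '{ for (i = 0; i < $2; i++) print $1 }' "$1"
}
```

A PE start script could call this as
"pe_hostfile_to_machinefile $PE_HOSTFILE > $TMPDIR/machines" and the job
script would then hand $TMPDIR/machines to the launcher.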

The usage is pretty simple:

$ qsub -pe myParallelEnvironment 10 ./my-10-CPU-parallel-job.sh

When SGE launches the job, the location of the custom hostfile will be
visible as an environment variable ($PE_HOSTFILE). Your script then takes
that file and passes it to mpirun or the equivalent parallel program
launcher.
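
A minimal sketch of such a job script follows. It assumes an MPICH-style
mpirun that accepts -np and -machinefile; the binary ./my_mpi_app, the
launch_mpi function, and the MPIRUN/APP/MACHINEFILE override variables
are placeholders I made up for illustration. $PE_HOSTFILE, $NSLOTS and
$JOB_ID are set by SGE when the job runs.

```shell
#!/bin/sh
#$ -cwd

# Launch the parallel program on the hosts SGE allocated.
launch_mpi() {
    mpirun_cmd="${MPIRUN:-mpirun}"      # placeholder launcher
    app="${APP:-./my_mpi_app}"          # hypothetical MPI binary
    # Point MACHINEFILE at a translated copy if your launcher cannot
    # read SGE's hostfile format directly.
    machinefile="${MACHINEFILE:-${PE_HOSTFILE:-}}"
    if [ -z "$machinefile" ] || [ -z "${NSLOTS:-}" ]; then
        echo "PE_HOSTFILE/NSLOTS not set; submit with qsub -pe" >&2
        return 1
    fi
    "$mpirun_cmd" -np "$NSLOTS" -machinefile "$machinefile" "$app"
}

# Only launch when actually running as an SGE job.
if [ -n "${JOB_ID:-}" ]; then
    launch_mpi
fi
```

Taking the process count from $NSLOTS rather than hard-coding it means
the same script keeps working if you change the number given to
"qsub -pe".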

The downside to loose integration is that SGE does not manage or deal
with the parallel job at all and thus can't get good accounting stats or
clean up the aftermath of runaway jobs.

That is why people often try to achieve "tight integration", which is
when SGE is responsible for actually launching and managing the parallel
job and all its children.

Tight integration is often pretty hard to get going robustly.


-Chris







David Adelson wrote:

> Does anyone on this listserv have any experience integrating SGE and  
> MPI within an OS X cluster?
> 
> Specifically, we have a user who wants to run the MPI-parallelized
> version of tree-puzzle on our OS X cluster, which is currently managed
> using SGE. While I have seen a preliminary integration of LAM-MPI and
> SGE at
> http://gridengine.sunsource.net/project/gridengine/howto/lam/SGE_LAM_Integration.html
> it is not clear how straightforward this might be in the real world.
> 
> Any firsthand info and experience you want to share would be welcome.
> 
> Cheers,
> 
> Dave Adelson
> Texas A&M University

-- 
Chris Dagdigian, <dag@sonsorol.org>
Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net