[Bioclusters] SGE/MPI on OS X
Chris Dagdigian
bioclusters@bioinformatics.org
Tue, 20 Jul 2004 16:53:41 -0400
{ My $.02 }
SGE comes with preconfigured "example" templates for both mpich and pvm
integration. Take a look in $SGE_ROOT/examples/mpi/ for the MPI files.
My experience with parallel environments in Grid Engine is that they are
largely application-specific: most of the time you need to configure a
separate parallel environment (PE) for each app you hope to run on the
cluster. The reason is that each app tends to need its own
start/stop/cleanup commands, and those rarely generalize well.
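As a rough sketch of what one such per-application PE might look like, the snippet below writes a PE definition to a file that an SGE admin could then load with `qconf -Ap`. The PE name matches the qsub example used in this message; the script paths, slot count, and allocation rule are placeholders (assumptions, not taken from any real site), and the exact set of fields varies between SGE versions, so compare against `qconf -sp` output for an existing PE before loading anything.

```shell
#!/bin/sh
# Write a hypothetical PE definition for one application.
# All paths and values below are placeholders; field names follow the
# sge_pe(5) format but differ slightly between SGE releases.
pe_file=myParallelEnvironment.pe

cat > "$pe_file" <<'EOF'
pe_name            myParallelEnvironment
slots              32
user_lists         NONE
xuser_lists        NONE
start_proc_args    /path/to/my-app-start.sh $pe_hostfile
stop_proc_args     /path/to/my-app-stop.sh
allocation_rule    $fill_up
control_slaves     FALSE
job_is_first_task  TRUE
EOF

echo "Wrote $pe_file; an SGE admin can load it with: qconf -Ap $pe_file"
```

Setting control_slaves to FALSE corresponds to the "loose" style of integration; the start and stop scripts are where the app-specific setup and cleanup would go.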
Your best bet initially is to take things in phases:
First: get your MPI application running on the cluster outside of Grid
Engine.
Next: go for "loose integration" of a parallel environment with SGE.
With "loose" integration, all SGE is responsible for is finding the
correct number of hosts and job slots and then generating a custom MPI
hostfile that your app must "honor" when it runs.
The nice thing about loose integration is that it is easy to set up --
SGE may output the hostfile in a format that your app can recognize and
use right away, or you may have to take the simple extra step of writing
a small start script for your PE (its start_proc_args hook) that handles
the task of "translating" the hostfile format into one that is
recognized by the application.
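A minimal sketch of such a translation step, assuming the app wants an MPICH-style machinefile with one hostname per slot. SGE's PE hostfile has one line per host in the form "host slots queue processors"; the function name and the output location mentioned in the comments are made up for illustration.

```shell
#!/bin/sh
# Convert an SGE PE hostfile ("host slots queue processors" per line)
# into a machinefile that lists each hostname once per granted slot.
pe_hostfile_to_machinefile() {
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while read -r host slots _rest || [ -n "$host" ]; do
        # Skip blank or incomplete lines.
        [ -n "$host" ] && [ -n "$slots" ] || continue
        i=0
        while [ "$i" -lt "$slots" ]; do
            printf '%s\n' "$host"
            i=$((i + 1))
        done
    done < "$1"
}

# Convert only when a hostfile path is given as the first argument,
# e.g. from a PE start script:
#   my-app-start.sh $pe_hostfile > $TMPDIR/machines
if [ -n "${1:-}" ]; then
    pe_hostfile_to_machinefile "$1"
fi
```

Other launchers expect other layouts (for example "host:slots" on a single line per host), in which case only the inner loop needs to change.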
The usage is pretty simple:
$ qsub -pe myParallelEnvironment 10 ./my-10-CPU-parallel-job.sh
When SGE launches the job, the location of the custom hostfile is
visible in the job's environment as $PE_HOSTFILE. Your script then takes
that file and passes it to mpirun or the equivalent parallel program
launcher.
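For illustration, the body of a job script for a loosely integrated job might look roughly like the sketch below. It assumes an MPICH-style mpirun that accepts -np and -machinefile, and it assumes the PE start script has already written a machinefile to $TMPDIR/machines; the function name, the MPIRUN override, and the app name in the usage line are hypothetical. NSLOTS and PE_HOSTFILE are set by SGE for jobs submitted with -pe.

```shell
#!/bin/sh
# Hypothetical launcher for a loosely integrated MPI job.
# Assumes the PE start script left a machinefile at $TMPDIR/machines.
launch_mpi() {
    machinefile="${TMPDIR:-/tmp}/machines"
    if [ ! -r "$machinefile" ]; then
        echo "no machinefile at $machinefile (PE hostfile: ${PE_HOSTFILE:-unset})" >&2
        return 1
    fi
    # MPIRUN may be set to point at a specific MPI installation.
    "${MPIRUN:-mpirun}" -np "${NSLOTS:-1}" -machinefile "$machinefile" "$@"
}
```

A job script would end with something like `launch_mpi ./my-mpi-app` and be submitted with the qsub line shown above, so the process count always follows the slots SGE actually granted.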
The downside to loose integration is that SGE does not manage or deal
with the parallel job at all, and thus can't get good accounting stats
or clean up the aftermath of runaway jobs.
That is why people often try to achieve "tight integration", which is
when SGE is responsible for actually launching and managing the parallel
job and all its children.
Tight integration is often pretty hard to get going robustly.
-Chris
David Adelson wrote:
> Does anyone on this listserv have any experience integrating SGE and
> MPI within an OS X cluster?
>
> Specifically we have a user who wants to run the MPI-parallelized
> version of tree-puzzle on our OS X cluster that is currently managed
> using SGE. While I have seen a preliminary integration of LAM-MPI and
> SGE on
> http://gridengine.sunsource.net/project/gridengine/howto/lam/SGE_LAM_Integration.html
> it is not clear how straightforward this might be in the real world.
>
> Any firsthand info and experience you want to share would be welcome.
>
> Cheers,
>
> Dave Adelson
> Texas A&M University
--
Chris Dagdigian, <dag@sonsorol.org>
Independent life science IT & informatics consulting
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E Yahoo IM: craffi Web: http://bioteam.net