[Bioclusters] gridMathematica and bioinformatics

Andrew A. de Laix bioclusters@bioinformatics.org
Fri, 24 Oct 2003 09:20:13 -0700

	Below I quote one of our gridMathematica developers about your
problem.  I hope this satisfactorily answers your question, but if it
doesn't, please let me know.  I might be able to arrange a meeting between
you and one of our engineers if it would help you to get more out of
gridMathematica.  If you would like that, please let me know who the
official owner of the software is and I will see what I can do.


Andrew A. de Laix, PhD
Business Development Manager
Wolfram Research, Inc.
phone: 510-655-5806
email: delaix@wolfram.com
web: http://www.wolfram.com

"One easy way around this problem for a cluster administrator is to 
provide an init.m file with:

RemoteMachine[ <MachineName>, <ExternalLaunchCommand> ]

definitions. This way the users of the cluster do not really have to 
know anything, except:

LaunchSlaves[]      -- launches all slaves in the list
LaunchSlaves[ {m1, ..., mk} ]
                    -- launches the slaves m1 through mk, each of
                       which is a RemoteMachine definition

One can also make this dynamic, if there is some server to query for 
available machines.

Different clusters use different mechanisms to launch and schedule a 
new job. With gridMathematica you don't really want a 
scheduler involved at all, since most "job" schedulers assume that the 
process will end. This is not the model used by gridMathematica. Here 
you launch a remote Mathematica, which you do not shut down until your 
whole program is done. A job within the gridMathematica framework is an 
expression to be evaluated. So in effect each running Mathematica is 
like a processor or computer in the system. Most clusters cannot 
virtualize enough resources to move such a process around; IP addresses, 
for example, typically are not virtualized."
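
As a rough illustration of the "dynamic" variant mentioned above, the
following Python sketch turns a scheduler-provided host list into
RemoteMachine definitions that an administrator could write into an
init.m. It assumes the scheduler exposes a plain-text file with one
hostname per line (PBS does this through $PBS_NODEFILE; other schedulers
differ). The launch command, including the "launch-mathkernel" wrapper
script, is a made-up placeholder and not gridMathematica's actual
command; substitute whatever your site uses to start a remote kernel.

```python
import os
import sys

# Hypothetical launch command; "{host}" is filled in per node.
# "launch-mathkernel" is an invented site-local wrapper script, not a
# command shipped with gridMathematica.
LAUNCH_TEMPLATE = "ssh {host} launch-mathkernel"


def read_hosts(lines):
    """Return unique hostnames in first-seen order, skipping blank lines."""
    hosts = []
    for line in lines:
        host = line.strip()
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def mathematica_string(text):
    """Quote text as a Mathematica string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def remote_machine_definitions(hosts, template=LAUNCH_TEMPLATE):
    """Build one RemoteMachine[name, command] expression per host."""
    return [
        "RemoteMachine[{}, {}]".format(
            mathematica_string(host),
            mathematica_string(template.format(host=host)),
        )
        for host in hosts
    ]


def render_list(definitions):
    """Format the definitions as a Mathematica list, one per line."""
    return "{" + ",\n ".join(definitions) + "}"


if __name__ == "__main__":
    # Assumption: we are running inside a PBS job, so PBS_NODEFILE is set.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        with open(nodefile) as fh:
            print(render_list(remote_machine_definitions(read_hosts(fh))))
    else:
        print("PBS_NODEFILE is not set; nothing to do", file=sys.stderr)
```

The printed list has the same shape as the {m1, ..., mk} argument to
LaunchSlaves described above, so users would never need to see the
hostnames themselves. Note that the sketch collapses repeated hostnames
to one entry each; a site that wants one kernel per allocated slot would
keep the duplicates instead.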

>Andrew Fant
>PharmaWulf: cluster builder
>     I was the lead on the project that Chris mentioned in the posting
>you referenced.  At the moment, we don't have any
>biology-specific applications of Mathematica in use or under 
>development.  The issue we are running into with GridMathematica has to 
>do with the mechanism of spawning slave processes.  At present, the 
>software appears to be designed to require users to specify what hosts 
>to spawn processes on. Are there any plans to allow people using 
>GridMathematica in a scheduler/batch driven cluster to request some 
>sort of node list from the scheduler?  We have a work-around that uses 
>shell scripts and cut and paste, but that exposes a lot more of the 
>internal workings of the cluster to the end-users than anyone here is 
>comfortable with.
>       Andy