[Bioclusters] anyone using gridMathematica ?

Chris Dagdigian bioclusters@bioinformatics.org
Thu, 19 Jun 2003 16:44:25 -0400


Cheezy perl scripting to the rescue!

Since Mathematica can accept input from external binaries and shell 
scripts, I wrote a wrapper that pretends to be a parallel MPI job for 
the purposes of connecting to the batch scheduler and getting a legit 
machinefile/hostlist back. The output from the cluster scheduler is 
mangled into Mathematica syntax and piped into the math script.

This is not optimal because the gridMathematica job will still launch 
and run outside the control of the cluster scheduler and resource 
management layers. On the plus side, however, we at least allow the 
cluster software to make informed decisions about which nodes the math 
jobs run on.

Here is an example of how to manually launch remote math kernels on 2 
hardcoded machines (compute05 and compute06) within gridMathematica:

> Needs["Parallel`Parallel`"]
> Needs["Parallel`VirtualShared`"]
> Needs["Parallel`Commands`"]
> $RemoteCommand = "ssh `1` math -mathlink";
> SetOptions[LaunchSlave,ConnectionType->LinkLaunch];
> LaunchSlave["compute05"]
> LaunchSlave["compute06"]
> TableForm[
>           RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID, $ProcessID, $Version}],
>           TableHeadings ->{None,{"ID","host","OS","process","Mathematica Version"}}]
> CloseSlaves[]

This is the output of those commands when actually run:

> dag/gridmathematica> math < daginit.m 
> Mathematica 4.2 for Linux
> Copyright 1988-2002 Wolfram Research, Inc.
>  -- Terminal graphics initialized -- 
> 
> In[1]:= 
> In[2]:= 
> In[3]:= 
> In[4]:= 
> In[5]:= 
> In[6]:= 
> Out[6]= LinkObject[ssh compute05 math -mathlink, 1, 1]
> In[7]:= 
> Out[7]= LinkObject[ssh compute06 math -mathlink, 2, 2]
> In[8]:= 
> Out[8]//TableForm= 
>  
>       ID   host        OS      process   Mathematica Version
>        1    compute05   Linux   15239     4.2 for Linux (August 23, 2002)
> 
>        2    compute06   Linux   15284     4.2 for Linux (August 23, 2002)


Cool. Now I just need a way to get the cluster load management layer to 
pick the machines for me and then somehow get that info into the 
mathematica scripts...

This is where cheezy perl scripting comes to the rescue:

1. write a script that pretends to be a parallel MPI app; submit it
2. slurp in the machinefile that the DRM creates just for that app
3. Mangle the machine names into mathematica "LaunchSlave[]" syntax
4. Spew that info into a running mathematica script

The script is called 'gridwrap' and it takes only 1 command line 
argument -- the number of remote mathlink kernels you want to run.

Here it is running on the command line:

> dag/gridmathematica> ./gridwrap 5
> LaunchSlave["compute10"]
> LaunchSlave["compute10"]
> LaunchSlave["compute21"]
> LaunchSlave["compute21"]
> LaunchSlave["compute14"]
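
The real gridwrap is a Perl script and is not reproduced here. As a 
rough illustration of steps 2 and 3 only (read the machinefile, mangle 
it into LaunchSlave[] lines), here is a minimal sketch in Python. It 
assumes the machinefile has one hostname per line, repeated once per 
slot, which may not match what your DRM writes. The function name 
machinefile_to_launchslave and the command-line usage are hypothetical, 
and the step of submitting the fake MPI job to the scheduler is not 
covered at all.

```python
import sys


def machinefile_to_launchslave(machinefile_text, count):
    """Turn machinefile text into a list of LaunchSlave["host"] lines.

    Assumption: one hostname per line, repeated once per slot. Blank
    lines and '#' comment lines are skipped, and only the first
    whitespace-separated token on each line is used as the hostname.
    """
    hosts = []
    for line in machinefile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hosts.append(line.split()[0])
    # Emit at most `count` entries, one per requested remote kernel.
    return ['LaunchSlave["%s"]' % host for host in hosts[:count]]


# Hypothetical usage: python mangle.py <machinefile> <count>
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as fh:
        text = fh.read()
    for out_line in machinefile_to_launchslave(text, int(sys.argv[2])):
        print(out_line)
```

Printing one LaunchSlave[] expression per line to stdout matters 
because the output is read back as Mathematica input, so anything else 
the script prints (debug chatter, scheduler messages) would have to go 
to stderr or be suppressed.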

The final step was reading a mathematica book and learning that one can 
pipe 'structured' syntax directly into a running script by using the form:

  << "! myExternalBinary"

So our new gridMathematica sample code looks like this and will launch 
5 Mathematica mathlink kernels on CPUs that are designated by the Linux 
cluster DRM layer:

> Needs["Parallel`Parallel`"]
> Needs["Parallel`VirtualShared`"]
> Needs["Parallel`Commands`"]
> $RemoteCommand = "ssh `1` math -mathlink";
> SetOptions[LaunchSlave,ConnectionType->LinkLaunch];
> << "! ./gridwrap 5"
> TableForm[
>           RemoteEvaluate[{$ProcessorID, $MachineName, $SystemID, $ProcessID, $Version}],
>           TableHeadings ->{None,{"ID","host","OS","process","Mathematica Version"}}]
> CloseSlaves[]
> 


The output looks like this:

> dag/gridmathematica> math < dag2.m
> Mathematica 4.2 for Linux
> Copyright 1988-2002 Wolfram Research, Inc.
>  -- Terminal graphics initialized -- 
> 
> In[1]:= 
> In[2]:= 
> In[3]:= 
> In[4]:= 
> In[5]:= 
> In[6]:= 
> Out[6]= LinkObject[ssh compute05 math -mathlink, 5, 5]
> 
> In[7]:= 
> Out[7]//TableForm= 
>  
>       ID   host        OS      process   Mathematica Version
>        1    compute17   Linux   15078     4.2 for Linux (August 23, 2002)
> 
>        2    compute17   Linux   15083     4.2 for Linux (August 23, 2002)
> 
>        3    compute07   Linux   15313     4.2 for Linux (August 23, 2002)
> 
>        4    compute07   Linux   15318     4.2 for Linux (August 23, 2002)
> 
>        5    compute05   Linux   15272     4.2 for Linux (August 23, 2002)


Now if I could just get those darn Mathematica fonts to work under X11 
on my powerbook I could actually do some rendering or math or something 
useful :)

-Chris