Using the Grid/Lsg-matlab

From BiGGrid Wiki

Intro

Yes, it is possible to use Matlab on the grid. This page explains how to get set up on the Life Science Grid.

In summary, the way to do this is to compile your Matlab script(s). The compiled program can then be called from a JDL job description. You will need to adapt your code to run properly on the grid nodes.

Prerequisites

  1. First, to be able to use Matlab and its compiler on the user interface machine, please request membership of the matlab group through grid.support@sara.nl
    1. If you compile your code yourself, this is not needed.
  2. You need to log in on one of the *.*.sara.nl user interface machines for the compiler to work.
  3. For your jobs to run, the worker node where the job lands must have the Matlab runtime environment available and loaded; you specify this in the wrapper script as "module load mcr". Note that not all grid nodes have Matlab capabilities. However, it is available on all nodes of the Life Science Grid.

Good practice: If you run Matlab through an X server and start up the compiler module, the license stays unavailable to others until Matlab is closed. As the number of available Matlab compiler licenses is limited, you are advised to compile your projects from the command line interface of your user interface machine.


Loading necessary modules

Current version: Matlab r2009b is available by default. To load it, use:

module load matlab

The Matlab Compiler Runtime is also installed on all worker nodes of the LifeScience Grid clusters. Somewhere in your job scripts you will have to load the MCR.

Please make sure you load an MCR that is compatible with the version of Matlab you are using.

module load mcr

For an overview of the compiler options, you can also type

mcc -help
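A quick way to confirm that the modules were loaded is to check that the expected commands are on your PATH. The following is only a sketch: the helper name missing_commands is hypothetical, and it assumes that "module load matlab" puts the commands matlab and mcc on the PATH.

```shell
#!/bin/bash
# Hypothetical helper: print the name of every command that is NOT on the PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
    return 0
}

# Assumption: "module load matlab" provides the commands matlab and mcc.
missing=$(missing_commands matlab mcc)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing >&2
fi
```

If this prints a warning, load the module again in the current shell before trying to compile.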

Must do: disable multithreading

By default Matlab r2008b supports multithreading. On the grid this is problematic: each job is assigned a single thread, so using more than one means stealing resources from another job slot and running with suboptimal efficiency. Therefore you have to set the number of threads to 1 in your Matlab code:

maxNumCompThreads(1)

If you use more than a single thread per job, you run the risk of your job being killed by the administrators.
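Because it is easy to forget this call, you may want to check your main script before compiling. The sketch below is an illustration under assumptions: the function name has_single_thread_call is hypothetical, and the check is a plain text search that only ignores lines starting with the Matlab comment character %.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the given Matlab source file contains
# a call to maxNumCompThreads(1) on a line that is not commented out.
has_single_thread_call() {
    local file="$1"
    # Drop lines whose first non-blank character is '%', then look for the call.
    grep -v '^[[:space:]]*%' "$file" \
        | grep -Eq 'maxNumCompThreads[[:space:]]*\([[:space:]]*1[[:space:]]*\)'
}

# Example use (uncomment and adapt the file name):
# has_single_thread_call magicsquare.m || echo "maxNumCompThreads(1) is missing" >&2
```

A text search cannot tell whether the call is actually reached at run time, so treat a positive result as a reminder rather than a guarantee.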

Magicsquare example

To get from your project to a compiled executable, take the following steps, illustrated here with a simple example script.

First set up the compilation environment; normally you only have to do this once:

mbuild -setup

NB We have noticed that running mbuild -setup on the command prompt does not always work. Running this command within the Matlab interpreter itself works.

Create a directory for the magic square example:

mkdir magic
cd magic

Copy the example magicsquare script:

cp /opt/matlab[version]/extern/examples/compiler/magicsquare.m .

Compile

mcc -m magicsquare

Run the executable. If you did not load the mcr module beforehand, you will get an error message saying that a library cannot be found. It can also happen that all compiler licenses are occupied; in that case be patient and try again. When in doubt, don't hesitate to ask.

./magicsquare 3
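The "be patient and try again" advice for occupied compiler licenses can be automated with a small retry loop. This is a sketch only: the function name retry is hypothetical, and it assumes that mcc exits with a non-zero status when it cannot obtain a license.

```shell
#!/bin/bash
# retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most MAX_TRIES times,
# sleeping DELAY_SECONDS between attempts.
retry() {
    local max_tries="$1" delay="$2"
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example use on the user interface machine (uncomment to run):
# retry 5 60 mcc -m magicsquare
```

Keep the number of attempts small and the delay generous, since a genuine compile error will fail on every attempt.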

We ran the program on the user interface machine; now we submit it with a JDL file so that it lands on a worker node on the grid.

Submitting to the grid

We are going to make a jdl file in the same directory

vi Magicsquare.jdl

that looks like this.

Executable = "magicsquare.sh";
Stdoutput = "stdout";
StdError = "stderror";
InputSandbox = {"magicsquare.sh","magicsquare"};
OutputSandbox = {"stdout","stderror"};
RetryCount = 0;

NB Be aware that the version of Matlab with which the script is compiled is linked to the version of MCR. Since more than one version is installed you should check which MCR version you need to use. An extra line in your jdl file would be:

Requirements = Member("nl.vl-e.poc-release-3",other.GlueHostApplicationSoftwareRunTimeEnvironment);

This will make sure that your job only runs on clusters that publish the nl.vl-e.poc-release-3 tag. At the moment MCR 7.11 (belonging to Matlab r2009b) is installed on all (64-bit) Life Science Grid clusters, Gina, Nikhef and RUG.
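If you compile several programs, you may prefer to generate the JDL file rather than edit it by hand. The sketch below reproduces the attributes of the example above; the function name write_jdl and its arguments are hypothetical, and it assumes that the wrapper script is always named after the executable with a .sh suffix.

```shell
#!/bin/bash
# write_jdl EXECUTABLE [RUNTIME_TAG]
# Print a JDL description for EXECUTABLE and its wrapper EXECUTABLE.sh.
# If RUNTIME_TAG is given, add the matching Requirements line.
write_jdl() {
    local exe="$1" tag="${2:-}"
    cat <<EOF
Executable = "${exe}.sh";
Stdoutput = "stdout";
StdError = "stderror";
InputSandbox = {"${exe}.sh","${exe}"};
OutputSandbox = {"stdout","stderror"};
RetryCount = 0;
EOF
    if [ -n "$tag" ]; then
        echo "Requirements = Member(\"${tag}\",other.GlueHostApplicationSoftwareRunTimeEnvironment);"
    fi
}

# Example use (uncomment to write the file):
# write_jdl magicsquare nl.vl-e.poc-release-3 > Magicsquare.jdl
```

Leaving out the second argument produces the plain JDL without the Requirements line.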

Then create the bash script magicsquare.sh

vi magicsquare.sh

that looks like this.

#!/bin/bash --login
module load mcr
export MCR_CACHE_ROOT="$TMPDIR/MCR_CACHE"
mkdir -p "$MCR_CACHE_ROOT"
chmod +x magicsquare
./magicsquare 3

Note 1 Sometimes the environment is not set correctly within jobs, while there are no problems interactively. By adding the --login option in the first line of your script you ensure that the interactive environment is loaded as well.

Note 2 Not all Grid clusters have the same setup. Therefore it is good practice to ensure that the MCR cache is written to a specified, job-dependent directory. In some cases jobs running on the same cluster share the same home directory. In that case the MCR cache directories might overlap, causing your MCR jobs to crash.
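One way to make sure that two jobs never share a cache, even when TMPDIR is not set or is shared, is to let mktemp pick a unique directory name. This is a sketch under assumptions: the function name make_mcr_cache is hypothetical, and falling back to /tmp when TMPDIR is unset is a choice made here, not something prescribed by the grid setup.

```shell
#!/bin/bash
# Hypothetical helper: create a private MCR cache directory and print its path.
# Uses $TMPDIR when it is set, otherwise /tmp (an assumption of this sketch).
make_mcr_cache() {
    local base="${TMPDIR:-/tmp}"
    mktemp -d "${base}/mcr_cache.XXXXXX"
}

# Use the unique directory as the MCR cache for this job.
MCR_CACHE_ROOT="$(make_mcr_cache)"
export MCR_CACHE_ROOT
```

In a wrapper script these lines would go before the call to the compiled program, and the directory can be removed with rm -rf "$MCR_CACHE_ROOT" once the program has finished.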

Submit the jdl file using glite-wms-job-submit.
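Before submitting, it can save a failed job to check that every file named in the InputSandbox actually exists. The sketch below is an illustration only: the function names are hypothetical, and the parsing assumes that the InputSandbox attribute is written on a single line with file names that contain no commas. The submit command in the final comment uses the -a (automatic delegation) and -o (store the job identifier in a file) options as commonly documented for the gLite WMS client; check them against the help of the client on your user interface machine.

```shell
#!/bin/bash
# list_input_sandbox JDL_FILE
# Print each InputSandbox entry on its own line (single-line attribute assumed).
list_input_sandbox() {
    grep -i '^[[:space:]]*InputSandbox' "$1" \
        | sed -e 's/^[^{]*{//' -e 's/}.*$//' \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*"//' -e 's/"[[:space:]]*$//'
}

# check_input_sandbox JDL_FILE
# Return non-zero if any listed file is missing from the current directory.
check_input_sandbox() {
    local jdl="$1" missing=0 f
    while IFS= read -r f; do
        [ -n "$f" ] || continue
        if [ ! -e "$f" ]; then
            echo "missing sandbox file: $f" >&2
            missing=1
        fi
    done <<EOF
$(list_input_sandbox "$jdl")
EOF
    return "$missing"
}

# Example use (uncomment to run on the user interface machine):
# check_input_sandbox Magicsquare.jdl && glite-wms-job-submit -a -o jobids Magicsquare.jdl
```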