VO-specific software and modules

From PDP/Grid Wiki
Latest revision as of 12:26, 7 September 2011

Introduction

VOs sometimes need to deploy their own software at a particular site. The Software Group Manager (SGM) VOMS role exists for this purpose. Not all VOs have this role, and not all VOs with this role are supported at each site. If you are a Software Group Manager for a particular VO, the VOMS role will have been assigned to you and you can generate an SGM-specific proxy.

For these Software Group Managers this HOWTO explains how to deploy your own software and how to add your own module to those available through the module command.

VOMS Role

The HEP VOs have an SGM role, usually of the form

 /Role=lcgadmin

However, in this HOWTO the VO vlemed was chosen as the example. This VO has a role

/Role=sgm

available, which gives users who possess that role the right to install software in the VO-specific software area. To generate an SGM-proxy use:

$ voms-proxy-init --voms vlemed:/vlemed/Role=sgm

You can view your current VOMS roles using

$ voms-proxy-info -all | sed 's/^/ /'
subject   : /O=dutchgrid/O=users/O=nikhef/CN=Jan Just Keijser/CN=proxy
issuer    : /O=dutchgrid/O=users/O=nikhef/CN=Jan Just Keijser
identity  : /O=dutchgrid/O=users/O=nikhef/CN=Jan Just Keijser
type      : proxy
strength  : 1024 bits
path      : /tmp/x509up_u7651
timeleft  : 11:59:33
=== VO vlemed extension information ===
VO        : vlemed
subject   : /O=dutchgrid/O=users/O=nikhef/CN=Jan Just Keijser
issuer    : /O=dutchgrid/O=hosts/OU=sara.nl/CN=voms.grid.sara.nl
attribute : /vlemed/Role=sgm/Capability=NULL
attribute : /vlemed/Role=NULL/Capability=NULL
timeleft  : 11:59:33
uri       : voms.grid.sara.nl:30003
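
Checking for the role can also be scripted. The following is a minimal sketch, not part of the original HOWTO: the helper name has_role is hypothetical, and it assumes the attribute lines look like the ones in the listing above.

```shell
#!/bin/sh
# has_role VO ROLE (hypothetical helper)
# Reads `voms-proxy-info -all` output on stdin and succeeds (exit 0)
# if one of the "attribute" lines carries /VO/Role=ROLE.
has_role() {
    grep -Eq "^attribute[[:space:]]*:[[:space:]]*/$1/Role=$2(/|\$)"
}

# Example with a valid proxy (not run here):
#   voms-proxy-info -all | has_role vlemed sgm || echo "no SGM role in proxy" >&2
```

A wrapper like this is mainly useful at the top of a submission script, so that a job is never submitted with a plain (non-SGM) proxy by accident.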

VO-specific software area

The VO-specific software area is denoted using the environment variable

VO_<VO>_SW_DIR

e.g. for vlemed it is VO_VLEMED_SW_DIR.
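
The naming rule can be written down as a small helper. This is a sketch under assumptions: the function name sw_dir_var is hypothetical, and the mapping of characters such as "." and "-" to "_" is assumed here rather than taken from this HOWTO, which only shows the upper-casing of vlemed.

```shell
#!/bin/sh
# sw_dir_var VO (hypothetical helper)
# Print the name of the software-area variable for VO.
# Assumption: the VO name is upper-cased and every character that is
# not a letter or digit is mapped to "_".
sw_dir_var() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9' '_')
    printf 'VO_%s_SW_DIR\n' "$name"
}

sw_dir_var vlemed    # prints VO_VLEMED_SW_DIR
```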

At Nikhef the $VO_VLEMED_SW_DIR has the following permissions:

$ cd $VO_VLEMED_SW_DIR
$ ls -ald .
drwxrwsr-t 4 root vlemedsm 4096 Dec 11 16:17 .

This means the directory is writable only by its owner (root) and by members of the Unix group vlemedsm. When submitting a job to Nikhef using the SGM-proxy a special pool-account is chosen:

$ id vlemsm00
uid=70960(vlemsm00) gid=2058(vlemedsm) groups=2024(vlemed),2058(vlemedsm)

So the SGM-proxy causes a mapping to a pool-account with access rights to install software in $VO_VLEMED_SW_DIR.

Just to verify: a "regular" vlemed pool-account does not have these permissions:

$ id vlemed00
uid=53200(vlemed00) gid=2024(vlemed) groups=2024(vlemed)

Note: Checking this before proceeding is good practice, as site misconfigurations at this level occur quite frequently.
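
One way to perform this check from inside a test job is sketched below. The helper name can_install is hypothetical; it only tests what the shell can see (directory exists, write and search permission) and is not a substitute for inspecting the id and ls -ald output.

```shell
#!/bin/sh
# can_install DIR (hypothetical helper)
# Succeed only if DIR exists and the current account may create
# entries in it (write and search permission).
can_install() {
    [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Example inside a job: show the mapped account, then test the area.
id
if can_install "${VO_VLEMED_SW_DIR:-}"; then
    echo "software area is writable"
else
    echo "software area is NOT writable for this account" >&2
fi
```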

Building the software

Before deploying new software we need to build and package it first on a local system:

  • Notes on building
    • The target platform is a RHEL5 64-bit compatible system. The easiest approach is to build the software on such a system (e.g. ui.grid.sara.nl).
    • It is not possible to hardcode paths into the software: the VO_<VO>_SW_DIR points to different directories on different clusters. Try to ensure that your software is relocatable using environment variables. Most software allows for this.
  • Notes on packaging
    • In this HOWTO all software is packaged as a single tarball
    • The installation job then only has to download and extract the tarball in the right location.
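
As an illustration of the relocatability note above, a launcher script can work out its own installation prefix at run time instead of relying on a hardcoded path. This is a sketch only: prefix_of is a hypothetical helper, and it assumes the launcher lives in a bin directory directly below the prefix, as in the layout used in this HOWTO.

```shell
#!/bin/sh
# prefix_of SCRIPT (hypothetical helper)
# Print the absolute installation prefix of a launcher that lives in
# <prefix>/bin/, wherever the tree happens to be unpacked.
prefix_of() {
    (cd "$(dirname "$1")/.." && pwd)
}

# Inside a launcher such as fsl-5.0/bin/fsl one could then write:
#   FSLDIR=$(prefix_of "$0")
#   export FSLDIR
```

With this pattern the same tarball works unchanged on clusters where VO_<VO>_SW_DIR points to different directories.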

For this HOWTO we will package a non-existent version 5.0 of the fsl package. The following directory structure has been set up for this package:

<local-dir>/fsl-5.0
<local-dir>/fsl-5.0/bin
<local-dir>/fsl-5.0/bin/fsl

where fsl-5.0/bin/fsl is a dummy script.

Packaging this software is very easy:

$ cd <local-dir>
$ tar czvf ~/my-fsl-5.0.tar.gz fsl-5.0

We then upload this tarball to a public webserver (or gridftp server):

$ scp ~/my-fsl-5.0.tar.gz ~/public_html
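
Before uploading, it can be worth confirming that the tarball unpacks into a single top-level directory, since the installation job extracts it directly into the software area. The helper below is a hypothetical sketch using only standard tar, sed and sort.

```shell
#!/bin/sh
# tar_top_entries TARBALL (hypothetical helper)
# List the distinct top-level entries of a gzip-compressed tarball,
# one per line.
tar_top_entries() {
    tar tzf "$1" | sed -e 's|^\./||' -e 's|/.*||' | sort -u
}

# Example: tar_top_entries ~/my-fsl-5.0.tar.gz
# For the layout used in this HOWTO the only entry should be fsl-5.0.
```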

Deploying the software

To deploy the software we use this JDL file:

Executable = "deploy.sh";
StdOutput = "stdout";
StdError = "stderr";
InputSandbox = {"deploy.sh"};
OutputSandbox = {"stdout","stderr"};

with this deploy.sh script:

#!/bin/bash
# Set a sane umask, just in case
umask 0022

# Stop if the software area is missing or inaccessible
cd "$VO_VLEMED_SW_DIR" || exit 3
wget http://www.nikhef.nl/~janjust/my-fsl-5.0.tar.gz || exit 1
tar xzf my-fsl-5.0.tar.gz || exit 2
# make sure the permissions are right
chmod -R u+rw,g+rw,o+r-w fsl-5.0
# List the directory afterwards for inspection
ls -l
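
The unpack step can be dry-run locally before a job is submitted. The sketch below is not the script used in this HOWTO: unpack_into is a hypothetical helper, and it uses the capital X permission so that search/execute permission is added only to directories and to files that are already executable.

```shell
#!/bin/sh
# unpack_into TARBALL DESTDIR TOPDIR (hypothetical helper)
# Extract TARBALL (give an absolute path) inside DESTDIR and open up
# permissions on TOPDIR so that the group can maintain it and everyone
# else can read it. Runs in a subshell, so the caller's umask and
# working directory are left untouched.
unpack_into() {
    (
        umask 0022
        cd "$2" || exit 3
        tar xzf "$1" || exit 2
        # Capital X: search/execute only for directories and for files
        # that are already executable for someone.
        chmod -R u+rwX,g+rwX,o+rX,o-w "$3"
    )
}

# Example: unpack_into "$HOME/my-fsl-5.0.tar.gz" /tmp/sw-test fsl-5.0
```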

Run this job with an SGM-proxy like any other job at the cluster where you want to install it, e.g.

$ glite-wms-job-submit -d janjust.sgm -r gazon.nikhef.nl:2119/jobmanager-pbs-short deploy.jdl

Wait for completion and check the stdout and stderr files for any errors.

Adding a module

All VO-specific modules need to be installed in

VO_<VO>_SW_DIR/modules

Only modulefiles installed in this directory will be automagically picked up by the worker node login scripts.
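
The actual login scripts on the worker nodes are not shown in this HOWTO. Purely as an illustration of the mechanism, a login script could extend MODULEPATH along the following lines; the function name vo_module_dirs and the whole approach are assumptions, not the site's real implementation.

```shell
#!/bin/sh
# vo_module_dirs (hypothetical helper)
# Print <dir>/modules for every exported VO_*_SW_DIR variable whose
# software area contains a modules directory.
vo_module_dirs() {
    env | sed -n 's/^VO_[A-Za-z0-9_]*_SW_DIR=//p' | while IFS= read -r dir; do
        if [ -d "$dir/modules" ]; then
            printf '%s\n' "$dir/modules"
        fi
    done
}

# Append each directory to MODULEPATH (paths containing whitespace
# would need more careful handling than this loop gives them).
for d in $(vo_module_dirs); do
    MODULEPATH="${MODULEPATH:+$MODULEPATH:}$d"
done
export MODULEPATH
```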

Here is the listing of a sample modulefile:

#%Module1.0#####################################################################
##
## fsl 5.0 modulefile
##

proc ModulesHelp { } {
        global fslversion

        puts stderr "\tSet up the environment for FSL"
        puts stderr "\n\tVersion $fslversion\n"
}

module-whatis   "sets FSL environment"

# for Tcl script use only
set     fslversion      5.0

set fsldir      "$env(VO_VLEMED_SW_DIR)/fsl-5.0"

prepend-path    PATH    "$fsldir/bin"
setenv          FSLDIR  "$fsldir"

which adds support for a (non-existent) version 5.0 of the package fsl.

Note especially the

$env(VO_VLEMED_SW_DIR)

in this modulefile: all software needs to be installed relative to this directory (and must therefore be relocatable). $env(...) is the Tcl syntax that modulefiles use to read an environment variable.
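
A quick sanity check on a modulefile can catch hardcoded paths before deployment. The following sketch is a rough heuristic, not a full parser: check_modulefile is a hypothetical helper that only looks for the "#%Module" magic cookie on the first line and for absolute paths passed directly to set, setenv, prepend-path or append-path.

```shell
#!/bin/sh
# check_modulefile FILE (hypothetical helper)
# Succeed if FILE starts with the "#%Module" magic cookie and passes no
# absolute path directly to set/setenv/prepend-path/append-path.
check_modulefile() {
    head -n 1 "$1" | grep -q '^#%Module' || return 1
    if grep -Eq '^[[:space:]]*(set|setenv|prepend-path|append-path)[[:space:]].*[[:space:]]"?/' "$1"; then
        return 1
    fi
    return 0
}

# Example: check_modulefile modules/fsl/5.0 || echo "please review" >&2
```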

It is by far easiest to develop and test new modulefiles on a local system. After making sure that the modulefile works, you can either include it in the software tarball or create a separate tarball, e.g. mypackage-X.Y-module.tar.gz.

For this HOWTO we will do the latter:

$ cd <local-dir>
$ ls 
fsl-5.0
$ mkdir -p modules/fsl
$ cp .../my-new-module-file modules/fsl/5.0

Before packaging it we test it first:

$ export MODULEPATH=$PWD/modules
$ module avail
$ module load fsl/5.0
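
After loading the module it is worth confirming that the command really resolves to the new installation and not to a system-wide copy. A minimal sketch, with resolves_under as a hypothetical helper name:

```shell
#!/bin/sh
# resolves_under CMD PREFIX (hypothetical helper)
# Succeed if CMD is found on PATH and its location lies below PREFIX.
resolves_under() {
    found=$(command -v "$1") || return 1
    case "$found" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# After `module load fsl/5.0` (FSLDIR is set by the sample modulefile):
#   resolves_under fsl "$FSLDIR" && echo "fsl comes from $FSLDIR"
```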

When finished we package it:

$ tar czf ~/my-fsl-5.0-module.tar.gz modules
$ cp ~/my-fsl-5.0-module.tar.gz ~/public_html

And we deploy it in exactly the same manner as the actual software (see above).

Using the module

In order to use our shiny new module we launch a normal job. Here is a listing of a sample job which checks the MODULEPATH variable and a few other modules-related things:

#!/bin/bash -l

id
ls -l $VO_VLEMED_SW_DIR
echo "MODULEPATH=$MODULEPATH"
echo "## Listing available modules:"
module avail 2>&1
echo "## Loading fsl"
module load fsl 2>&1
echo "## Which modules are now loaded:"
module list 2>&1

module unload fsl 2>&1
echo "## Loading MY fsl module"
module load fsl/5.0 2>&1
echo "## Which modules are now loaded:"
module list 2>&1

echo "## which fsl:"
which fsl

The output of this job when run at the Nikhef cluster is:

uid=53207(vlemed07) gid=2024(vlemed) groups=2024(vlemed)
total 16
drwxr-sr-x 3 vlemsm05 vlemedsm 4096 Dec 11 16:17 fsl-5.0
drwxr-sr-x 4 vlemsm05 vlemedsm 4096 Dec 11 16:22 modules
MODULEPATH=/opt/vl-e/modules/Modules/versions:/opt/vl-e/modules/Modules/$MODULE_VERSION/modulefiles:
           /etc/opt/vl-e/modulefiles::/data/esia/vlemed/modules
## Listing available modules:
---------------------- /opt/vl-e/modules/Modules/versions ----------------------
3.2.6

----------------- /opt/vl-e/modules/Modules/3.2.6/modulefiles ------------------
dot         module-cvs  module-info modules     null        use.own

-------------------------- /etc/opt/vl-e/modulefiles ---------------------------
fsl/4.0              javagat/1.7.1        pl/5.6.64
fsl/4.0.4            lam/7.1              r/2.6
fsl/4.1              lam/7.1.4            r/2.6.2
fsl/4.1.4            mcr/7.11             r/2.9
gat/1.8              mesa3d/6.4           r/2.9.2
gat/1.8.2            mesa3d/6.4.2         rmpi/0.5
graphviz/2.18        mpitb/2.1            srb/3.4
gt/4.0               mpitb/2.1.73         srb/3.4.2
gt/4.0.8             mricro/1.39          vlet/1.0
ibis/1.4             mricro/1.39.3        vlet/1.0.2
itk/3.14             octave/2.1           vtk/4.4
itk/3.14.0           octave/2.1.73        vtk/4.4.2
itk/3.4              openmpi/1.3.2        vtk/5.4
itk/3.4.0            openrdf-sesame/2.0   vtk/5.4.0
java/1.6             openrdf-sesame/2.0.1 weka/3.4
javagat/1.7          pl/5.6               weka/3.4.12

-------------------------- /data/esia/vlemed/modules ---------------------------
fsl/5.0       mypackage/5.0
## Loading fsl
## Which modules are now loaded:
Currently Loaded Modulefiles:
  1) fsl/4.1.4
## Loading MY fsl module
## Which modules are now loaded:
Currently Loaded Modulefiles:
  1) fsl/5.0
## which fsl:
/data/esia/vlemed/fsl-5.0/bin/fsl

Notes

  • When adding a new version (e.g. 5.0) of an existing system-wide package (e.g. fsl), this version does not become the default, as can be seen in the script output. To use the new version you have to specify the version number explicitly.
  • Completely new packages are picked up automatically:
$ module load mypackage
$ module list
Currently Loaded Modulefiles:
  1) fsl/4.1.4       2) mypackage/5.0