Ganga basic usage

Revision as of 11:52, 11 January 2010

Introduction

This guide helps beginners understand how to use Ganga to manage computational jobs running locally or on the Grid. It provides step-by-step instructions for running a simple "HelloWorld" job through GANGA. Users will run GANGA on a NIKHEF desktop (e.g. elel22.nikhef.nl) and submit jobs to Stoomboot (a PBS cluster) and to the LCG.

Because Ganga is also a job management tool for end users, this wiki also introduces a few useful commands for managing Ganga jobs.

Requirements

You need the appropriate privileges to submit jobs to a local cluster (e.g. a NIKHEF account for Stoomboot and/or a CERN account for lxbatch).
You need a valid grid certificate registered in a Virtual Organization (e.g. ATLAS) to run jobs on the Grid.

Preparation

Password-less login between desktop and Stoomboot nodes

This step is needed for managing jobs on Stoomboot with Ganga.

Starting GANGA session

For NIKHEF users

% source /project/atlas/nikhef/dq2/dq2_setup.sh.NIKHEF
% export DPNS_HOST=tbn18.nikhef.nl
% export LFC_HOST=lfc-atlas.grid.sara.nl
% source /project/atlas/nikhef/ganga/etc/setup.[c]sh
% ganga --config-path=/project/atlas/nikhef/ganga/config/Atlas.ini.nikhef

Every time you start from a clean shell, you need to set up Ganga again with the lines given above.

For CERN lxplus users
```
% source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh
% ganga
```
More detail for CERN users can be found here: http://ganga.web.cern.ch/ganga/user/index.php

The last command loads a system-wide ATLAS-specific configuration for your Ganga session. You can override the system-wide configuration by providing a ~/.gangarc file. The template of the ~/.gangarc file can be generated by:

% ganga -g

If you see the following prompt:

*** Welcome to Ganga ***
Version: Ganga-5-1-1
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.

This is free software (GPL), and you are welcome to redistribute it
under certain conditions; type license() for details.

In [1]:

you are in a GANGA session. The GANGA session is actually an IPython shell with GANGA-specific extensions (modules), meaning that you can write and run code (Python only, of course) inside the GANGA session.

Leaving GANGA session

To quit from a GANGA session, just press CTRL-D.

Getting familiar with GANGA

My first Grid job running a HelloWorld shell script

Now go to your project directory

cd /project/atlas/Users/yourusernamehere

and create 'myscript.sh'

#!/bin/sh
echo 'myscript.sh running...'
echo "----------------------"
/bin/hostname
echo "HELLO PLANET!"
echo "----------------------"

and the file 'gangaScript.py'. Do not forget to adapt the path below to your own directory structure:

j = Job()
j.application=Executable()
j.application.exe=File('/project/atlas/Users/yourusernamehere/myscript.sh')
j.backend=LCG()
j.submit()

The lines of this Ganga job definition mean the following:

  * Line 1 defines the job
  * Line 2 sets its application to an Executable
  * Line 3 tells Ganga which file to run
  * Line 4 tells Ganga where the job should run
  * Line 5 submits the job

The important point here is that we have chosen LCG() as the backend, i.e. the script will be executed on the Grid. Now start Ganga again and submit the job to the LCG:

In[n]: execfile("./gangaScript.py")

The status of the job can be monitored with

In[n]: jobs

After the job is submitted, GANGA is responsible for monitoring it while it is still running, and for downloading its output files (e.g. stdout/stderr) to the local machine when it has finished.
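When a script needs to wait for a job rather than check the table by hand, a small polling helper can be used. The following is a minimal sketch, not part of Ganga: the function name and the list of final states are assumptions (only "completed" and "failed" appear in this guide; "killed" is added on the assumption that a killed job reports that status), and it relies on Ganga's own monitoring updating j.status while the loop sleeps.

```python
import time

# Assumed final states: "completed" and "failed" appear in this guide,
# "killed" is an assumption about what a killed job reports.
FINAL_STATES = ('completed', 'failed', 'killed')


def wait_for_final_status(job, poll_seconds=30, timeout_seconds=3600,
                          sleep=time.sleep):
    """Wait until job.status is a final state or the timeout expires.

    `job` can be any object with a `status` attribute. The `sleep`
    argument exists only so the waiting can be replaced in tests.
    Returns the last status seen.
    """
    waited = 0
    while job.status not in FINAL_STATES and waited < timeout_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
    return job.status
```

In a Ganga session this would be called as wait_for_final_status(j) after j.submit(). The timeout keeps the loop from hanging forever if the job never leaves the submitted state.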

When your job is completed, the job's output is automatically fetched from the Grid and stored in your gangadir directory. The exact output location can be found by:

In[n]: j.outputdir
Out[n]: /project/atlas/Users/yourusernamehere/gangadir/workspace/yourusernamehere/LocalAMGA/0/output

if 0 is the job ID. This was our first Grid job submitted via Ganga!

Working with historical jobs

GANGA internally archives your previously submitted jobs (historical jobs) in the local job repository (i.e. gangadir), so you don't have to do the bookkeeping yourself. You can freely leave and re-enter GANGA and still have your historical jobs ready for future work.

The first step in working with a historical job is to get the job instance from the repository, as follows:

In [n]: jobs
Out[1]: 
Job slice:  jobs (12 jobs)
--------------
# fqid      status        name   subjobs      application          backend  backend.actualCE                                                 
#   17   submitted                  1000       Executable              LCG                                                 
#   18   submitted                  2000       Executable              LCG                                                                                     
#   20   completed                    10       Executable              LCG
#   28   submitted                             Executable              LCG
#   29   submitted    test_lcg                 Executable              LCG

The table above lists the historical jobs in your GANGA repository indexed by fqid. For example, if you are interested in the job with id 29, you can get the job instance by

In [n]: j = jobs(29)

then you are all set to work with the job.

Please note that you CANNOT change the attributes of a historical job.
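With many historical jobs it can be handy to pick them out by status instead of reading the table. The sketch below is a plain-Python illustration, assuming that the jobs object can be iterated and that each job exposes status and id attributes; the helper names are hypothetical and not part of Ganga.

```python
def select_by_status(job_list, status):
    """Return the jobs in job_list whose status equals the given string."""
    return [j for j in job_list if j.status == status]


def ids_of(job_list):
    """Return the ids of the given jobs, in the order they were given."""
    return [j.id for j in job_list]
```

In a session, ids_of(select_by_status(jobs, 'completed')) would then give the ids to pass to jobs(...).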

More GANGA jobs to run on different platforms

Now try the following commands in the Ganga shell to get your hands dirty :) Try to find out where the second job runs.

In [n]: j = Job()
In [n]: j.backend=Local()
In [n]: j.submit()
In [n]: jobs

In [n]: j = j.copy()
In [n]: j.backend=PBS()
In [n]: j.backend.queue = 'test'
In [n]: j.submit()
In [n]: jobs

If you run Ganga on a NIKHEF desktop, the PBS backend should be configured for submitting jobs to the Stoomboot cluster.
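The copy, change backend, submit pattern used above can be wrapped in a small helper when the same job should be tried on several backends. This is a sketch only: the function name is hypothetical, and it uses nothing beyond the copy(), backend, backend.queue and submit() calls already shown on this page.

```python
def submit_copy_on(job, backend, queue=None):
    """Copy `job`, point the copy at `backend`, and submit the copy.

    If `queue` is given, it is set on the copy's backend before
    submission, as done for the PBS example on this page.
    Returns the newly submitted copy.
    """
    new_job = job.copy()
    new_job.backend = backend
    if queue is not None:
        new_job.backend.queue = queue
    new_job.submit()
    return new_job
```

In a session this could be used as submit_copy_on(j, PBS(), queue='test'). The original job is left untouched, which matters because attributes of an already submitted job cannot be changed.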

After job submission

Checking job status

GANGA automatically polls for the up-to-date status of your jobs and updates the local repository accordingly. A notification is shown when a job's status changes.

In addition, you can get a job summary table by:

In [n]: jobs

or a summary table for subjobs (you won't have subjobs unless you use a Splitter with the job; a Splitter may be used for more advanced applications):

In [n]: j.subjobs
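For a job with hundreds of subjobs, a count per status is often easier to read than the full table. The sketch below assumes that j.subjobs can be iterated and that each subjob has a status attribute; the helper name is hypothetical.

```python
def count_statuses(subjobs):
    """Return a dict mapping each status string to the number of subjobs in it."""
    counts = {}
    for subjob in subjobs:
        counts[subjob.status] = counts.get(subjob.status, 0) + 1
    return counts
```

In a session, count_statuses(j.subjobs) would return something of the form {'completed': ..., 'failed': ...}, depending on the states the subjobs are in.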

Killing and removing jobs

You can kill a job by calling

In [n]: j.kill()

Ganga keeps a killed job referable, so the working directory and registry entry of the killed job are still kept by Ganga (and can take up disk space). If you really want to erase everything related to the job from Ganga, you can remove it by

In [n]: j.remove()
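To free disk space in bulk, the remove() call can be applied to every job in a given state. The following is a sketch under assumptions: the helper name is hypothetical, the 'killed' status string is assumed to be what a killed job reports, and each job is assumed to expose status and id attributes.

```python
def remove_jobs_with_status(job_list, statuses=('killed',)):
    """Call remove() on every job whose status is in `statuses`.

    The matching jobs are collected into a list first, so that the
    repository is not modified while it is being iterated over.
    Returns the ids of the removed jobs.
    """
    to_remove = [j for j in job_list if j.status in statuses]
    removed_ids = []
    for j in to_remove:
        removed_ids.append(j.id)
        j.remove()
    return removed_ids
```

Since removal cannot be undone, it is worth checking the summary table once more before calling remove_jobs_with_status(jobs) in a session.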

Failing jobs manually

Some unexpected issues may leave Ganga unable to update a job's status to failed as it should. In this case, you can manually force the job into the failed state:

In [n]: j.force_status("failed", force=True)

This stops Ganga from polling the status of a problematic job that may already be gone from the backend system.
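Because forcing a status overrides what Ganga has recorded, it is safer to guard the call so that finished jobs are never touched. The sketch below wraps the force_status call shown above; the helper name and the default list of "stuck" states are assumptions ('running' does not appear elsewhere on this page).

```python
def force_fail_if_stuck(job, stuck_states=('submitted', 'running')):
    """Force `job` to 'failed' only if its status is one of `stuck_states`.

    Returns True if the status was forced, False if the job was left alone.
    """
    if job.status in stuck_states:
        job.force_status('failed', force=True)
        return True
    return False
```

A completed or already failed job passes through unchanged, so the helper can be applied to a whole list of suspicious jobs without overwriting good results.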

Basic troubleshooting

GANGA tries to bring stdout/stderr back to the client side even when the job has failed remotely on the Grid. For failed jobs, you can inspect them as follows for troubleshooting:

In [n]: j.peek('stdout','less')
In [n]: j.peek('stderr','cat')

or

In [n]: j.peek('stdout.gz','zcat')
In [n]: j.peek('stderr.gz','zcat')

for the LCG jobs.
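Since the output file may be plain or gzipped depending on the backend, a script that processes job output has to handle both cases. The following is a minimal sketch that reads directly from the job's output directory (the j.outputdir path); the function name is hypothetical and the '.gz' naming follows the peek examples above.

```python
import gzip
import os


def read_job_output(outputdir, name='stdout'):
    """Return the text of `name` or `name`.gz found in `outputdir`.

    The plain file is preferred when both exist. Returns None when
    neither file is present.
    """
    plain_path = os.path.join(outputdir, name)
    gzip_path = plain_path + '.gz'
    if os.path.exists(plain_path):
        handle = open(plain_path, 'rb')
    elif os.path.exists(gzip_path):
        handle = gzip.open(gzip_path, 'rb')
    else:
        return None
    try:
        data = handle.read()
    finally:
        handle.close()
    # Undecodable bytes are replaced rather than raising an error.
    return data.decode('utf-8', 'replace')
```

In a session, read_job_output(j.outputdir) and read_job_output(j.outputdir, 'stderr') would return the text for searching or further parsing.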

More actions on a job

Try typing j.<TAB Key> in your Ganga session; the auto-completion feature of IPython will tell you the exported methods of the Ganga job object.

Or you can get help on the job object, for example:

In [n]: j = jobs[-1]
In [n]: help(j)
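The same list of names can also be obtained programmatically, for example to print it or search it. The sketch below uses only Python's built-in dir(); the helper name is hypothetical, and it is an assumption that dir() on a Ganga job shows the same names as TAB completion.

```python
def public_names(obj):
    """Return the sorted attribute and method names of obj that do not start with '_'."""
    return sorted(name for name in dir(obj) if not name.startswith('_'))
```

In a session, public_names(j) gives a plain list that can be filtered, e.g. to find every name containing 'status'.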