Difference between revisions of "Ganga basic usage"
Revision as of 11:52, 11 January 2010
This page, Ganga basic usage, helps beginners understand how to use Ganga to manage computational jobs running locally or on the Grid. It provides step-by-step instructions for running a simple "HelloWorld" job through GANGA. Users will run GANGA on a NIKHEF desktop (e.g. elel22.nikhef.nl) and submit jobs to Stoomboot (a PBS cluster) and to the LCG.
As Ganga is also a job management tool for end users, this wiki also introduces a few useful commands for managing Ganga jobs.
- You need the proper privileges for submitting jobs to a local cluster (e.g. a NIKHEF account for Stoomboot and/or a CERN account for lxbatch).
- You need to have a valid grid certificate registered in a Virtual Organization (e.g. ATLAS) for running jobs on the Grid.
Password-less login between desktop and Stoomboot nodes
This step is needed for managing jobs on Stoomboot with Ganga.
Starting GANGA session
- For NIKHEF users
% source /project/atlas/nikhef/dq2/dq2_setup.sh.NIKHEF
% export DPNS_HOST=tbn18.nikhef.nl
% export LFC_HOST=lfc-atlas.grid.sara.nl
% source /project/atlas/nikhef/ganga/etc/setup.[c]sh
% ganga --config-path=/project/atlas/nikhef/ganga/config/Atlas.ini.nikhef
Every time you start with a clean shell, you will need to set up Ganga with the lines given above.
- For CERN lxplus users
% source /afs/cern.ch/sw/ganga/install/etc/setup-atlas.sh
% ganga
More detail for CERN users can be found here: http://ganga.web.cern.ch/ganga/user/index.php
The last command loads a system-wide ATLAS-specific configuration for your Ganga session. You can override the system-wide configuration by providing a ~/.gangarc file. The template of the ~/.gangarc file can be generated by:
% ganga -g
If you see the following prompt:
*** Welcome to Ganga ***
Version: Ganga-5-1-1
Documentation and support: http://cern.ch/ganga
Type help() or help('index') for online help.
This is free software (GPL), and you are welcome to redistribute it
under certain conditions; type license() for details.
In :
you are already in a GANGA session. The GANGA session is actually an IPython shell with GANGA-specific extensions (modules), meaning that you can write programs (in Python only, of course) inside the GANGA session.
Leaving GANGA session
To quit from a GANGA session, just press CTRL-D.
Getting familiar with GANGA
My first Grid job running a HelloWorld shell script
Now go to your project directory
and create 'myscript.sh'
#!/bin/sh
echo 'myscript.sh running...'
echo "----------------------"
/bin/hostname
echo "HELLO PLANET!"
echo "----------------------"
and the file 'gangaScript.py'. Do not forget to adapt the path below to your own directory structure:
In [n]: j = Job()
In [n]: j.application=Executable()
In [n]: j.application.exe=File('/project/atlas/Users/yourusernamehere/myscript.sh')
In [n]: j.backend=LCG()
In [n]: j.submit()
This Ganga Job means the following
- Line 1 defines the job
- Line 2 sets it as an Executable
- Line 3 tells which file to run
- Line 4 tells where the job should run
- Line 5 submits the job
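The five steps above can be collected in a small helper, which may be convenient when the same kind of job is built repeatedly. The following is a minimal sketch and not part of Ganga itself: `build_hello_job` is a hypothetical name, and the Ganga classes (`Job`, `Executable`, `File`) and a backend instance such as `LCG()` are passed in as arguments, so the function assumes nothing beyond the attributes already used in the session above.

```python
def build_hello_job(job_cls, executable_cls, file_cls, backend, script_path):
    """Assemble (but do not submit) a job that runs one shell script.

    Hypothetical helper: job_cls, executable_cls and file_cls are expected
    to be Ganga's Job, Executable and File; backend is an instance such
    as LCG(), Local() or PBS().
    """
    j = job_cls()                               # line 1: define the job
    j.application = executable_cls()            # line 2: set it as an Executable
    j.application.exe = file_cls(script_path)   # line 3: which file to run
    j.backend = backend                         # line 4: where the job should run
    return j                                    # line 5 (submit) is left to the caller
```

Inside a Ganga session this might be called as `j = build_hello_job(Job, Executable, File, LCG(), '/project/atlas/Users/yourusernamehere/myscript.sh')`, followed by `j.submit()`.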
The important point here is that we have chosen LCG() as the backend, i.e. the script will be executed on the Grid. Now start Ganga again and submit the job to the LCG.
The status of the job can be monitored with the jobs command, described under "Checking job status" below.
Once the job is submitted, GANGA is responsible for monitoring it while it is still running, and for downloading the output files (e.g. stdout/stderr) to the local machine when it has finished.
When your job is completed, the job's output is automatically fetched from the Grid and stored in your gangadir directory. The exact output location can be found by:
In [n]: j.outputdir
Out[n]: /project/atlas/Users/yourusernamehere/gangadir/workspace/yourusernamehere/LocalAMGA/0/output
if 0 was the job ID. This was our first grid-job submitted via ganga!
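Since the output directory is an ordinary path on the local file system, its contents can be inspected with plain Python from within the session. The sketch below is only an illustration: `list_job_output` is a hypothetical helper name, and it assumes the value of `j.outputdir` can be used as a directory path string.

```python
import os


def list_job_output(outputdir):
    """Return (file name, size in bytes) pairs for the regular files
    found directly in outputdir, sorted by file name."""
    entries = []
    for name in sorted(os.listdir(outputdir)):
        path = os.path.join(outputdir, name)
        if os.path.isfile(path):        # skip sub-directories
            entries.append((name, os.path.getsize(path)))
    return entries
```

In a Ganga session, `list_job_output(j.outputdir)` would then show which files were brought back for the job.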
Working with historical jobs
GANGA internally archives your previously submitted jobs (historical jobs) in the local job repository (i.e. gangadir), so that you do not have to do the bookkeeping yourself. You can freely leave and re-enter GANGA and still have your historical jobs ready for future work.
To work with a historical job, first look it up in the repository and retrieve its job instance:
In [n]: jobs
Out: Job slice: jobs (12 jobs)
--------------
# fqid status name subjobs application backend backend.actualCE
# 17 submitted 1000 Executable LCG
# 18 submitted 2000 Executable LCG
# 20 completed 10 Executable LCG
# 28 submitted Executable LCG
# 29 submitted test_lcg Executable LCG
The table above lists the historical jobs in your GANGA repository indexed by fqid. For example, if you are interested in the job with id 29, you can get the job instance by
In [n]: j = jobs(29)
then you are all set to work with the job.
Please note that you CANNOT change the attributes of a historical job.
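When the repository holds many historical jobs, a short summary can be easier to read than the full table. The following sketch assumes that the `jobs` object can be iterated over and that each job exposes `id` and `status` attributes; the helper names `count_by_status` and `ids_with_status` are hypothetical and only read the jobs, they do not modify them.

```python
from collections import Counter


def count_by_status(job_list):
    """Return a Counter mapping each status string to the number of jobs."""
    return Counter(job.status for job in job_list)


def ids_with_status(job_list, status):
    """Return the ids of all jobs whose status equals the given string."""
    return [job.id for job in job_list if job.status == status]
```

For example, `ids_with_status(jobs, 'completed')` would give the ids that can then be passed to `jobs(...)` one at a time.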
More GANGA jobs to run on different platforms
Now try the following commands in the Ganga shell to get your hands dirty :) Try to find out where the second job runs.
In [n]: j = Job()
In [n]: j.backend=Local()
In [n]: j.submit()
In [n]: jobs
In [n]: j = j.copy()
In [n]: j.backend=PBS()
In [n]: j.backend.queue = 'test'
In [n]: j.submit()
In [n]: jobs
If you run Ganga on a NIKHEF desktop, the PBS backend should be configured for submitting jobs to the Stoomboot cluster.
After job submission
Checking job status
GANGA automatically polls the up-to-date status of your jobs and updates the local repository accordingly. A notification is shown to the user whenever the status of a job changes.
In addition, you can get a job summary table by:
In [n]: jobs
or a summary table for subjobs. You will not have subjobs unless the job uses a Splitter, which may be needed for more advanced applications:
In [n]: j.subjobs
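In a script it can be useful to block until a job has reached a final state instead of looking at the table by hand. The sketch below is an illustration under stated assumptions: `wait_until_final` is a hypothetical helper, the set of final states ("completed", "failed", "killed") is an assumption based on the statuses mentioned on this page, and it relies on Ganga continuing to update `job.status` in the background while the helper sleeps.

```python
import time

# Assumed set of states after which the status no longer changes.
FINAL_STATES = ("completed", "failed", "killed")


def wait_until_final(job, poll_seconds=30, timeout_seconds=3600,
                     sleep=time.sleep):
    """Re-read job.status every poll_seconds until it is a final state
    or timeout_seconds have been spent waiting; return the last status."""
    waited = 0
    while job.status not in FINAL_STATES:
        if waited >= timeout_seconds:
            break                      # give up, the caller sees the status
        sleep(poll_seconds)
        waited += poll_seconds
    return job.status
```

The `sleep` parameter exists only so that the waiting function can be replaced, for example when trying the helper out without a real job.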
Killing and removing jobs
You can kill a job by calling
In [n]: j.kill()
Ganga keeps a killed job referable, so its working directory and its entry in the job registry are still kept (which can take up disk space). If you really want to erase everything related to the job from Ganga, remove it with
In [n]: j.remove()
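Cleaning up several jobs at once follows the same pattern. The sketch below is a hypothetical helper, not a Ganga function: it assumes the job collection can be iterated over and that each job has `id`, `status` and `remove()`. It defaults to a dry run that only reports which jobs would be removed, because removal cannot be undone.

```python
def remove_jobs_with_status(job_list, statuses=("killed",), dry_run=True):
    """Remove every job whose status is in statuses.

    Returns the ids of the selected jobs. With dry_run=True (the default)
    nothing is removed, so the selection can be checked first.
    """
    # Collect first, so the collection is not modified while iterating.
    selected = [job for job in job_list if job.status in statuses]
    ids = [job.id for job in selected]
    if not dry_run:
        for job in selected:
            job.remove()
    return ids
```

A cautious use would be to call `remove_jobs_with_status(jobs)` first, check the returned ids, and only then repeat the call with `dry_run=False`.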
Failing jobs manually
Unexpected problems in a job may leave Ganga unable to update the job status to failed as it should. In this case, you can manually force the job into the failed state:
In [n]: j.force_status("failed", force=True)
This stops Ganga from continuing to poll the status of a problematic job that may already be gone from the backend system.
Basic troubleshooting
GANGA tries to bring stdout/stderr back to the client side even when the job has failed remotely on the Grid. For failed jobs, you can therefore inspect these files for troubleshooting as follows:
In [n]: j.peek('stdout','less')
In [n]: j.peek('stderr','cat')
or
In [n]: j.peek('stdout.gz','zcat')
In [n]: j.peek('stderr.gz','zcat')
for the LCG jobs.
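The same files can also be read programmatically, for instance to search a long log for suspicious lines. The following is a minimal sketch under assumptions: `read_job_log` and `find_error_lines` are hypothetical helpers, the logs are assumed to sit directly in the job's output directory either as plain files or with a `.gz` suffix, and the keyword list is only an example.

```python
import gzip
import os


def read_job_log(outputdir, name="stdout"):
    """Return the text of a job log, trying the plain file first and
    then the gzip-compressed file name + '.gz'; None if neither exists."""
    plain = os.path.join(outputdir, name)
    if os.path.isfile(plain):
        with open(plain, "rb") as handle:
            return handle.read().decode("utf-8", "replace")
    packed = plain + ".gz"
    if os.path.isfile(packed):
        with gzip.open(packed, "rb") as handle:
            return handle.read().decode("utf-8", "replace")
    return None


def find_error_lines(text, keywords=("error", "failed", "not found")):
    """Return the lines of text containing any keyword (case-insensitive)."""
    return [line for line in text.splitlines()
            if any(key in line.lower() for key in keywords)]
```

In a session this might be used as `find_error_lines(read_job_log(j.outputdir, 'stderr') or '')`.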
More actions on a job
Try typing j.<TAB Key> in your Ganga session; the auto-completion feature of IPython will show you the exported methods of the Ganga job object.
Or you can get help on the job object, for example:
In [n]: j = jobs[-1]
In [n]: help(j)