Using GANGA with AMAAthena

Introduction

This document gives step-by-step instructions for running AMAAthena within GANGA on a NIKHEF desktop (e.g. ribble). AMAAthena is an Athena package providing ... developed at NIKHEF. GANGA is an official ATLAS grid utility for distributed data analysis.

The examples below assume that:

  1. Users have the following Athena job option files in the run directory of the AMAAthena package
    • AMAAthena_jobOptions.py
    • Trigger_jobOptions.py
  2. Users have the following AMA driver configuration files in the run directory of the AMAAthena package
    • exampleaod.conf
    • reader.conf
  3. The analysis is performed on the ATLAS dataset fdr08_run2.0052280.physics_Muon.merge.AOD.o3_f8_m10

Preparation

  1. Follow the CMT instructions to set up your CMTHOME directory
  2. Check out the AMAAthena package from CVS
  3. Make sure you start GANGA from a clean environment, without any Athena or CMT setup
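
One way to check the third point before starting GANGA is to scan the shell environment for variables that a previous CMT or Athena setup would have left behind. The sketch below is a heuristic only: the prefix and the variable names it looks for (CMT*, AtlasVersion, AtlasProject, TestArea) are assumptions chosen for illustration, not an official list, and the helper name is hypothetical.

```python
import os

# Names treated as signs of an existing CMT/Athena setup.
# This selection is an illustrative guess, not an official list.
SUSPECT_PREFIXES = ("CMT",)
SUSPECT_NAMES = ("AtlasVersion", "AtlasProject", "TestArea")


def leftover_setup_vars(env):
    """Return the sorted names in env that hint at a CMT/Athena setup."""
    return sorted(
        name
        for name in env
        if name.startswith(SUSPECT_PREFIXES) or name in SUSPECT_NAMES
    )


if __name__ == "__main__":
    found = leftover_setup_vars(os.environ)
    if found:
        print("Environment is not clean; found: " + ", ".join(found))
    else:
        print("No CMT/Athena variables found.")
```

If the script reports any variables, opening a fresh shell before sourcing the GANGA setup is the simplest remedy.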

Starting GANGA

Type the following commands from within the directory PhysicsAnalysis/AnalysisCommon/AMA/AMAAthena/cmt:

% source /project/atlas/nikhef/dq2/dq2_setup.sh.NIKHEF
% export DPNS_HOST=tbn18.nikhef.nl
% export LFC_HOST=lfc-atlas.grid.sara.nl
% source /project/atlas/nikhef/ganga/etc/setup.[c]sh
% ganga --config-path=/project/atlas/nikhef/ganga/config/Atlas.ini.nikhef

GANGA magic functions for cmtsetup

Inside GANGA, the complex CMT setup can be handled with two magic functions, cmtsetup and setup.

The following example shows how to set up the CMT environment for Athena 14.2.0 in 32-bit mode.

In [n]: config.Athena.CMTHOME = '/your/cmthome'
In [n]: cmtsetup 14.2.0,32
In [n]: setup

Running AMAAthena in GANGA

Creating new GANGA job

In [n]: j = Job()

Setting application

In [n]: j.application = AMAAthena()
In [n]: j.application.option_files += [ File('../run/AMAAthena_jobOptions.py'), File('../run/Trigger_jobOptions.py') ]
In [n]: j.application.driver_config.config_file = File('../run/exampleaod.conf')
In [n]: j.application.driver_config.include_file += [ File('../run/reader.conf') ]
In [n]: j.application.prepare()

Setting input data

  • StagerDataset: with the StagerDataset, the AMAAthena job uses the Athena FileStager to copy the dataset files from grid storage.
    In [n]: j.inputdata = StagerDataset()
    In [n]: j.inputdata.dataset += [ 'fdr08_run2.0052280.physics_Muon.merge.AOD.o3_f8_m10' ]
    
  • DQ2Dataset: with the DQ2Dataset, GANGA handles access to the dataset files itself, outside Athena.
    In [n]: j.inputdata = DQ2Dataset()
    In [n]: j.inputdata.dataset += [ 'fdr08_run2.0052280.physics_Muon.merge.AOD.o3_f8_m10' ]
    In [n]: j.inputdata.type = 'DQ2_DOWNLOAD'
    

Setting job splitter (optional)

The examples below limit each subjob to processing at most 2 files.

  • using StagerJobSplitter with StagerDataset
    In [n]: j.splitter = StagerJobSplitter()
    In [n]: j.splitter.numfiles = 2
    
  • using DQ2JobSplitter with DQ2Dataset
    In [n]: j.splitter = DQ2JobSplitter()
    In [n]: j.splitter.numfiles = 2
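
To make the effect of numfiles concrete, the sketch below groups a list of input files into chunks of at most numfiles entries, one chunk per subjob. It is plain Python and is not how either GANGA splitter is implemented; a real splitter may also take file locations into account, so the actual number of subjobs can differ. The function name and the file names are hypothetical.

```python
def split_into_subjobs(files, numfiles):
    """Group files into consecutive chunks of at most numfiles entries."""
    if numfiles < 1:
        raise ValueError("numfiles must be at least 1")
    return [files[i:i + numfiles] for i in range(0, len(files), numfiles)]


# Hypothetical file names, for illustration only.
files = ["AOD._%04d.pool.root" % n for n in range(1, 6)]
chunks = split_into_subjobs(files, 2)

# Five files with numfiles = 2 give three chunks; the last holds one file.
for index, chunk in enumerate(chunks):
    print(index, chunk)
```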
    

Setting computing backend

  • using Stoomboot cluster
    In [n]: j.backend = PBS()
    
  • using LCG
    In [n]: j.backend = LCG()
    

Submitting job

In [n]: j.submit()
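
The interactive steps above can also be collected into one function, which is convenient when the same configuration is reused. The sketch below follows the StagerDataset, StagerJobSplitter and PBS variant. Passing the GANGA names in through an argument g is an assumption made here so the function can be read and exercised outside a GANGA session; the function name and its parameters are hypothetical, and only calls already shown on this page are used.

```python
def build_amaathena_job(g, dataset, run_dir="../run", numfiles=2):
    """Configure (but do not submit) a job following the steps on this page.

    g is any object exposing the GANGA names used above as attributes
    (Job, AMAAthena, File, StagerDataset, StagerJobSplitter, PBS).
    Supplying them this way is an assumption of this sketch.
    """
    j = g.Job()

    # Application: Athena job options plus the AMA driver configuration.
    j.application = g.AMAAthena()
    j.application.option_files += [
        g.File(run_dir + "/AMAAthena_jobOptions.py"),
        g.File(run_dir + "/Trigger_jobOptions.py"),
    ]
    j.application.driver_config.config_file = g.File(run_dir + "/exampleaod.conf")
    j.application.driver_config.include_file += [g.File(run_dir + "/reader.conf")]
    j.application.prepare()

    # Input data, read through the Athena FileStager.
    j.inputdata = g.StagerDataset()
    j.inputdata.dataset += [dataset]

    # At most numfiles input files per subjob.
    j.splitter = g.StagerJobSplitter()
    j.splitter.numfiles = numfiles

    # Stoomboot cluster; StagerDataset is listed as unsupported on LCG.
    j.backend = g.PBS()
    return j
```

The returned job can be inspected first and then sent off with j.submit(), as in the interactive example.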

After job submission

Checking job status

GANGA automatically polls the status of your jobs and updates the local repository accordingly. A notification is shown to the user whenever a job changes status.

You can get a summary table of all jobs with:

In [n]: jobs

or a summary table for subjobs:

In [n]: j.subjobs
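
For a job with many subjobs, a tally per status is often easier to read than the full table. The sketch below assumes that each subjob object exposes its state as a string attribute called status; that attribute is not shown elsewhere on this page, so treat it as an assumption to verify in your GANGA version. The helper name is hypothetical.

```python
from collections import Counter


def count_statuses(subjobs):
    """Count subjobs per status, e.g. how many are completed or failed.

    Assumes every element has a string attribute named status.
    """
    return Counter(sj.status for sj in subjobs)
```

Inside GANGA this would be called as count_statuses(j.subjobs); the result behaves like a dictionary mapping each status to the number of subjobs in it.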

Killing and removing jobs

You can kill a job by calling

In [n]: j.kill()

or remove a job by

In [n]: j.remove()

Results and output merging

For the moment, each completed (sub-)job returns a ROOT summary file, stored in the summary sub-directory of the job's output directory.

For jobs using StagerJobSplitter, a RootMerger is automatically assigned to the job, so that the ROOT summary files from the subjobs are merged once the whole job is completed.

For jobs using DQ2Dataset, the merging can be done manually once the whole job is completed. For example, assuming each subjob produces a ROOT summary file called summary/summary_mySample_confFile_exampleaod.conf_nEvts_1000.root, the files can be merged with:

In [n]: merger = RootMerger()
In [n]: merger.files += ['summary/summary_mySample_confFile_exampleaod.conf_nEvts_1000.root']
In [n]: merger.overwrite = True
In [n]: merger.ignorefailed = True
In [n]: merger.merge(j)

The merged ROOT file, with the same name, is created in the job's outputdir.
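
Because merger.ignorefailed = True lets the merge proceed even when some subjobs did not deliver their file, it can be useful to list the affected subjobs first. The sketch below checks a list of output directories for the expected summary file. It assumes that each subjob exposes an outputdir attribute in the same way the parent job does; the helper name is hypothetical.

```python
import os


def missing_summaries(outputdirs, relpath):
    """Return the output directories in which relpath does not exist as a file."""
    return [d for d in outputdirs if not os.path.isfile(os.path.join(d, relpath))]
```

Inside GANGA this could be called as missing_summaries([sj.outputdir for sj in j.subjobs], 'summary/summary_mySample_confFile_exampleaod.conf_nEvts_1000.root'); an empty list means every subjob produced its summary file.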

Advanced usage

More information

Known issues/ToDo items

  • StagerDataset is not supported for jobs running on the LCG backend