FullChain on the grid
A specific example
We'll describe here an example that processes the full chain GENERATION-SIMULATION-DIGITIZATION-RECONSTRUCTION-ESDtoAOD for a Higgs → ZZ sample, where you can pick your favorite Higgs mass and Z decay channel. This exercise also allows you to test the script on your local ATLAS setup: first make sure it runs locally before submitting hundreds of jobs onto the grid.
In the process the POOL output of each step in the chain will be saved ON THE GRID under the following names:
gen_Higgs_ZZ_XXYY_XATHENAVERSIONX_JobXJOBNUMBERX.pool.root
hits_Higgs_ZZ_XXYY_XATHENAVERSIONX_JobXJOBNUMBERX.pool.root
digits_Higgs_ZZ_XXYY_XATHENAVERSIONX_JobXJOBNUMBERX.pool.root
esd_Higgs_ZZ_XXYY_XATHENAVERSIONX_JobXJOBNUMBERX.pool.root
aod_Higgs_ZZ_XXYY_XATHENAVERSIONX_JobXJOBNUMBERX.pool.root
where XATHENAVERSIONX is the Athena version and XJOBNUMBERX is the job number.
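For illustration, here is a minimal Python sketch of how these names are assembled from the channel name, the Athena version and the job number. The variable names here are illustrative, not taken from the actual script; only the resulting file names follow the convention above:

# Illustrative sketch of the naming scheme described above; variable names
# are assumptions, the resulting file names follow the text.
channel_name = "Higgs_ZZ_XXYY"    # ChannelName, see below
athena_version = "10.0.2"         # replaces XATHENAVERSIONX
job_number = 1                    # replaces XJOBNUMBERX

steps = ["gen", "hits", "digits", "esd", "aod"]
output_files = ["%s_%s_%s_Job%d.pool.root" % (step, channel_name, athena_version, job_number)
                for step in steps]
# e.g. output_files[0] == "gen_Higgs_ZZ_XXYY_10.0.2_Job1.pool.root"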
To change the output name Higgs_ZZ_XXYY, go to toShipOff_Everything.py and replace
ChannelName = "Higgs_ZZ_XXYY"
with the name you want.
In ShellScript_BASIC.sh you can add -d SE (Storage Element) to change the default storage element, e.g.:
lcg-cr --vo atlas -d tbn15.nikhef.nl -l lfn:filenameONTHEGRID file://${PWD}/filenameLOCALLY
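As an illustration, the generated shell script might build this registration line as follows. This is a hedged Python sketch: the helper function and its parameter names are hypothetical, only the lcg-cr syntax comes from the command above.

# Hypothetical helper showing how Create_Shell_Script() could emit the
# lcg-cr line with an optional storage element; names here are invented.
def lcg_cr_line(grid_name, local_name, storage_element=None):
    dest = ("-d %s " % storage_element) if storage_element else ""
    return "lcg-cr --vo atlas %s-l lfn:%s file://${PWD}/%s" % (dest, grid_name, local_name)

print(lcg_cr_line("gen_Higgs_ZZ_XXYY_10.0.2_Job1.pool.root",
                  "gen_Higgs_ZZ_XXYY_10.0.2_Job1.pool.root",
                  storage_element="tbn15.nikhef.nl"))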
The necessary (9) files
For each ATLAS job on the grid we'll need the following files:
- 1) JobOptions Files:
- 1.1) A joboptions file for our Generation job Generation_jobOptions_BASIC.py
- (Here you specify the physics process and details of the output. In our case: Run pythia, and produce a POOL output file)
- 1.2) A joboptions file for our Simulation job Simulation_jobOptions_BASIC.py
- (Here you specify some features about the simulation)
- 1.3) A joboptions file for our Digitization job Digitization_jobOptions_BASIC.py
- (Here you specify some features about the digitization)
- 1.4) A joboptions file for our Reconstruction job Reconstruction_jobOptions_BASIC.py
- (Here you specify some features about the reconstruction; the output is an ESD file)
- 1.5) A joboptions file for our ESDtoAOD job ESDtoAOD_jobOptions_BASIC.py
- (Here you specify some features about the ESDtoAOD step; the output is an AOD file)
- 2) A shell script that will run on the remote grid machine ShellScript_BASIC.sh
- (This script sets up the ATLAS environment and starts the Athena job)
- 3) A JDL file containing the names of all required input and output files: jdl_BASIC.jdl (a sketch of how such a file might be generated appears after this list)
- 4) A tar-ball with Atlas software AtlasStuff.tgz
- This has to be chosen according to the Athena version you want to run. Note as well that there are two files, RomeGeo2G4.py and RecExCommon_topOptions.py, which you might have to replace or modify when using a version of Athena other than 10.0.2.
- fieldmap.dat contains information about the magnetic field (necessary for simulation)
To facilitate the handling of a large number of jobs we have added two more scripts:
- 5) A script that produces all input files: toShipOff_Everything.py
- 6) A general tool from Wouter to manage your jobs on the grid: gridmgr
On your local machine please download these files into a single directory.
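To give an idea of what the JDL file from item 3 contains, here is a minimal sketch of how it might be generated for one job. The attribute values and the exact sandbox contents are assumptions; only the file names come from this page.

# Hedged sketch of Create_JDL_File(): writes a minimal EDG/LCG-style JDL
# naming the shell script as executable and shipping the input files along.
# The exact attributes used by jdl_BASIC.jdl may differ.
def create_jdl_file(job_number):
    sandbox = ['ShellScript_Higgs_Job%d.sh' % job_number,
               'joboptions_Higgs_Job%d.py' % job_number,
               # whether the tar-ball and field map travel in the sandbox
               # or are fetched by the shell script is an assumption here
               'AtlasStuff.tgz',
               'fieldmap.dat']
    jdl = 'Executable    = "ShellScript_Higgs_Job%d.sh";\n' % job_number
    jdl += 'StdOutput     = "stdout.log";\n'
    jdl += 'StdError      = "stderr.log";\n'
    jdl += 'InputSandbox  = {%s};\n' % ', '.join('"%s"' % f for f in sandbox)
    jdl += 'OutputSandbox = {"stdout.log", "stderr.log"};\n'
    with open('jdl_Higgs_Job%d.jdl' % job_number, 'w') as f:
        f.write(jdl)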
The main script (Higgs_ShipOff_Everything.py)
The main task of this script is easily illustrated by the routines that are called for each job:
Create_Generation_File()     # create joboptions file
Create_Simulation_File()     # create joboptions file
Create_Digitization_File()   # create joboptions file
Create_Reconstruction_File() # create joboptions file
Create_ESDtoAOD_File()       # create joboptions file
Create_JDL_File()            # create jdl file
Create_Shell_Script()        # create shell script
Submit_Job_To_Grid()         # submit job onto the grid
Cleanup_InputFiles()         # save input files
For each job unique joboptions files, a unique JDL file and a unique shell script are produced. Then the job is submitted (locally or on the grid) and finally the input files are stored in a separate directory. As input you give the number of jobs, the number of jobs to skip and the number of events per job, followed by a RunType which specifies whether you want to run locally or on the grid, and the Athena version. If you want to submit onto the grid you need to be on a User Interface machine (at NIKHEF this is for example ui03.nikhef.nl).
How to run. An example:
Submitting a single job, skipping 0 jobs, with 50 events, locally, with Athena 10.0.2:
Higgs_ShipOff_Everything.py 1 0 50 0 10.0.2
Submitting 20 jobs, skipping the first 10, with 5000 events each, on the grid, with Athena 10.0.2:
Higgs_ShipOff_Everything.py 20 10 5000 1 10.0.2
Note: if you choose a different version of Athena you'll have to make sure you're sending along the piece of code in AtlasStuff.tgz that corresponds to that distribution.
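From these examples the top-level flow of the script can be sketched as follows. This is a reconstruction, not the actual source: the argument order is taken from the examples above, the parsing code is an assumption, and the routines are those listed earlier (they are defined elsewhere in the real script).

import sys

# Schematic driver loop of Higgs_ShipOff_Everything.py (reconstruction).
n_jobs         = int(sys.argv[1])  # number of jobs to submit
n_skip         = int(sys.argv[2])  # number of jobs to skip
n_events       = int(sys.argv[3])  # events per job
run_type       = int(sys.argv[4])  # 0 = run locally, 1 = submit onto the grid
athena_version = sys.argv[5]       # e.g. "10.0.2"

for job_number in range(n_skip + 1, n_skip + n_jobs + 1):
    Create_Generation_File()      # joboptions for generation
    Create_Simulation_File()      # joboptions for simulation
    Create_Digitization_File()    # joboptions for digitization
    Create_Reconstruction_File()  # joboptions for reconstruction
    Create_ESDtoAOD_File()        # joboptions for ESD-to-AOD
    Create_JDL_File()             # unique JDL file for this job
    Create_Shell_Script()         # unique shell script for this job
    Submit_Job_To_Grid()          # submit locally or onto the grid
    Cleanup_InputFiles()          # move the input files to a separate directory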
What input files are produced for each job
If you comment out the 'submit' line in the script (#Submit_Job_To_Grid()) you can check what is produced without running an actual job. In the directory InputFiles_Job1 you'll find the three input files: ShellScript_Higgs_Job1.sh, joboptions_Higgs_Job1.py and jdl_Higgs_Job1.jdl.
Checking what is happening on the grid (gridmgr)
Using Wouter's tool it is now much easier to follow what is happening to your jobs on the grid.
Check status of all your jobs: ./gridmgr status -a
What output files are produced for each job
Once the job has finished retrieve the output from the job as follows:
Retrieve output for job 1: ./gridmgr retrieve --dir . <Job1>
In a cryptic directory we now find (apart from the standard input and error files):
"Generation_joboptions_XATHENAVERSIONX_JobXXJobNrXX.py", "Generation_XATHENAVERSIONX_JobXXJobNrXX.log" , "Simulation_joboptions_XATHENAVERSIONX_JobXXJobNrXX.py", "Simulation_XATHENAVERSIONX_JobXXJobNrXX.log", "Digitization_joboptions_XATHENAVERSIONX_JobXXJobNrXX.py", "Digitization_XATHENAVERSIONX_JobXXJobNrXX.log", "Recosntruction_joboptions_XATHENAVERSIONX_JobXXJobNrXX.py", "Reconstruction_XATHENAVERSIONX_JobXXJobNrXX.log", "ESDtoAOD_joboptions_XATHENAVERSIONX_JobXXJobNrXX.py", "ESDtoAOD_XATHENAVERSIONX_JobXXJobNrXX.log" With XATHENAVERIONX and XXJobNrXX corresponding to your previous choices
Where are my files?
Unless you've changed to a different SE (Storage Element), the files will be in the default SE on the grid. You can now retrieve them to a local machine using
lcg-cp --vo atlas lfn:<filename> file://<fullpath>/filename
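For example, assuming the default output names from the beginning of this page, the AOD of job 1 could be copied to the current directory with something like:
lcg-cp --vo atlas lfn:aod_Higgs_ZZ_XXYY_10.0.2_Job1.pool.root file://${PWD}/aod_Higgs_ZZ_XXYY_10.0.2_Job1.pool.root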