Difference between revisions of "Generating Higgs Events on the grid"

From Atlas Wiki
 
Latest revision as of 12:44, 7 November 2005

A specific example

We describe here an example in which we generate H → ZZ → XXYY events, where you can pick your favorite Higgs mass and Z decay channel. This exercise also lets you test the script on your local ATLAS setup. Make sure it runs locally before submitting hundreds of jobs to the grid.


The necessary (6) files

For each ATLAS job on the grid we'll need the following files:

1) A joboptions file for our Athena job: joboptions_Higgs_BASIC.py
(Here you specify the physics process and the details of the output. In our case: run Pythia and AtlFast, and produce a CBNT output file.)
2) A shell script that will run on the remote grid machine: ShellScript_Higgs_BASIC.sh
(This script sets up the ATLAS environment and starts the Athena job.)
3) A JDL file containing the names of all required input and output files: jdl_Higgs_BASIC.jdl (http://www.nikhef.nl/~ivov/HiggsGrid_Scripts/jdl_Higgs_BASIC.jdl)
4) A tar-ball with ATLAS software: AtlasStuff.tgz (http://www.nikhef.nl/~ivov/HiggsGrid_Scripts/AtlasStuff.tgz)

To facilitate the handling of a large number of jobs we have added two more scripts

5) A script that produces all input files: Higgs_ShipOff_Everything.py
6) A general tool from Wouter to manage your jobs on the grid: gridmgr

On your local machine please download these files into a single directory.

The main script (Higgs_ShipOff_Everything.py)

The main task of this script is easily illustrated by the routines that are called for each job:

    Create_Joboptions_File()                 # create joboptions file
    Create_JDL_File()                        # create jdl file
    Create_Shell_Script()                    # create shell script
    Submit_Job_To_Grid()                     # submit job onto the grid
    Cleanup_InputFiles()                     # save input files

For each job a unique joboptions file, a unique JDL file and a unique shell script are produced. Then the job is submitted (locally or on the grid) and finally the input files are stored in a separate directory. As input you give the number of jobs and the number of events per job, followed by a RunType that specifies whether you want to run locally (0) or on the grid (1). To submit to the grid you need to be on a User Interface machine (at NIKHEF this is, for example, ui03.nikhef.nl).

How to run. An example:

  Submitting a single job with 50 events locally:  Higgs_ShipOff_Everything.py 1 50 0
  Submitting 20 jobs with 5000 events on the grid: Higgs_ShipOff_Everything.py 20 5000 1
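
The command-line convention and the per-job sequence described above can be sketched in Python as follows. This is only an illustration and not the code of Higgs_ShipOff_Everything.py: the helper names parse_arguments and plan_jobs are hypothetical, the steps are returned as plain text instead of actually creating or submitting anything, and numbering the jobs from 1 is an assumption based on the "Job1" file names used on this page.

```python
def parse_arguments(argv):
    """Return (n_jobs, n_events, run_type) from a command line such as
    'Higgs_ShipOff_Everything.py 20 5000 1' (hypothetical helper)."""
    if len(argv) != 4:
        raise SystemExit("usage: %s <n_jobs> <n_events> <run_type>" % argv[0])
    n_jobs, n_events, run_type = (int(arg) for arg in argv[1:4])
    if run_type not in (0, 1):
        raise SystemExit("run_type must be 0 (local) or 1 (grid)")
    return n_jobs, n_events, run_type


def plan_jobs(n_jobs, n_events, run_type):
    """List, per job, the steps in the order in which the routines are called.
    Nothing is created or submitted; the steps are only described."""
    where = "grid" if run_type == 1 else "local"
    plan = []
    for job in range(1, n_jobs + 1):  # assumption: jobs are numbered from 1
        plan.append((job, [
            "create joboptions file (%d events)" % n_events,
            "create JDL file",
            "create shell script",
            "submit job (%s)" % where,
            "save input files",
        ]))
    return plan


# Example: the plan for two local jobs with 50 events each.
example_argv = ["Higgs_ShipOff_Everything.py", "2", "50", "0"]
for job, steps in plan_jobs(*parse_arguments(example_argv)):
    print("Job %d: %s" % (job, "; ".join(steps)))
```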

What input files are produced for each job

If you comment out the 'submit' line in the script (so that it reads #Submit_Job_To_Grid()) you can check what is produced without running an actual job. In the directory InputFiles_Job1 you'll find the three input files: ShellScript_Higgs_Job1.sh, joboptions_Higgs_Job1.py and jdl_Higgs_Job1.jdl.
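
A quick way to check that all three input files are there is sketched below. The file names for job 1 are the ones listed above; extending the same naming pattern to other job numbers is an assumption, and the helpers expected_input_files and missing_input_files are hypothetical names that are not part of the scripts.

```python
import os


def expected_input_files(job_number, base_dir="."):
    """Return the paths of the three input files expected for one job.
    Assumption: every job follows the 'Job1' naming pattern shown above."""
    job = "Job%d" % job_number
    directory = os.path.join(base_dir, "InputFiles_" + job)
    names = [
        "ShellScript_Higgs_%s.sh" % job,
        "joboptions_Higgs_%s.py" % job,
        "jdl_Higgs_%s.jdl" % job,
    ]
    return [os.path.join(directory, name) for name in names]


def missing_input_files(job_number, base_dir="."):
    """Return the expected input files that do not exist on disk."""
    return [path for path in expected_input_files(job_number, base_dir)
            if not os.path.isfile(path)]


# Example: report what is missing for job 1 in the current directory.
for path in missing_input_files(1):
    print("missing: " + path)
```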

Note: If you plan to produce many files you are advised to store your files on the grid instead of on a local disk of the UI machines. The changes you need to make are described on the web page by Gustavo: FullChain_on_the_grid (http://www.nikhef.nl/pub/experiments/atlaswiki/index.php/FullChain_on_the_grid)

Checking what is happening on the grid (gridmgr)

Using Wouter's tool it is now much easier to follow what is happening to your jobs on the grid.

  Check status of all your jobs:  ./gridmgr  status -a     

What output files are produced for each job

Once the job has finished, retrieve its output as follows:

  Retrieve output for job 1:      ./gridmgr retrieve --dir . <Job1>

In a cryptically named directory we now find (apart from the standard input and error files):

  Higgs.CBNT.Job1.root
  Logfile_Higgs_Job1.log
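
Because the name of the retrieved directory is hard to predict, it can be convenient to search for the two output files by name. The following is a minimal sketch under the assumption that other jobs follow the same "Job1" naming pattern; find_job_output is a hypothetical helper and not part of gridmgr.

```python
import os


def find_job_output(job_number, search_root="."):
    """Walk search_root and return {file name: path} for the output files
    of one job. Assumption: names follow the 'Job1' pattern shown above."""
    wanted = {
        "Higgs.CBNT.Job%d.root" % job_number,
        "Logfile_Higgs_Job%d.log" % job_number,
    }
    found = {}
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in filenames:
            if name in wanted and name not in found:
                found[name] = os.path.join(dirpath, name)
    return found


# Example: look for the output of job 1 below the current directory.
for name, path in sorted(find_job_output(1).items()):
    print("%s -> %s" % (name, path))
```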


Changing things: other generator/physics process, fast/full simulation, AOD/CBNT, etc.

The job is controlled by the standard joboptions file. To change the algorithms you'll need to change this file. The settings for this example (AtlFast+CBNT) were taken from RunningAtlfast (http://www.hep.ucl.ac.uk/atlas/atlfast/RunningAtlfast.html), where you can also find settings for AOD output.

To produce full simulation events you might have to submit more ATLAS software. The TestRelease package is not sufficient, and you'll need to pack all the code you'd like to use into a new tar-ball (called AtlasStuff.tgz). Note that all code is built on the remote machine.
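
Packing a new tar-ball can be done with the standard Python tarfile module, for example as sketched below. The internal layout that the shell script expects inside AtlasStuff.tgz is not described on this page, so placing each package directory at the top level of the archive is an assumption, and make_tarball and list_tarball are hypothetical helper names.

```python
import os
import tarfile


def make_tarball(source_dirs, tarball_name="AtlasStuff.tgz"):
    """Pack the given directories into a gzip-compressed tar-ball.
    Assumption: each directory is stored at the top level of the archive."""
    with tarfile.open(tarball_name, "w:gz") as tar:
        for directory in source_dirs:
            top_level_name = os.path.basename(os.path.normpath(directory))
            tar.add(directory, arcname=top_level_name)
    return tarball_name


def list_tarball(tarball_name):
    """Return the names of all members, to check what will be shipped."""
    with tarfile.open(tarball_name, "r:gz") as tar:
        return tar.getnames()
```

A call such as make_tarball(["TestRelease", "MyExtraPackage"]) would then pack both directories, where MyExtraPackage stands for whatever additional code you need. Checking the result with list_tarball before submitting saves a failed build on the remote machine.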