Produce and read microDSTs using the Grid

From LHCb Wiki

Latest revision as of 19:06, 27 March 2010

Producing microDSTs standalone

Getting the relevant packages and setting the environment

SetupProject DaVinci v22r0p2 --build-env
SetupProject DaVinci v22r0p2

getpack Phys/DaVinci v22r0p2
getpack PhysSel/Ccbar
getpack Ex/MicroDSTExample

cd ${User_release_area}/DaVinci_v22r0p2/Phys/DaVinci_v22r0p2/Phys/DaVinci/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/PhysSel/Ccbar/cmt
gmake

Making the microDST

Do

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job
gaudirun.py ../options/TestMicroDSTMake.py
ls -ltr

You should now see a .dst file in this directory which has just been created. For this standalone production of microDSTs take care that the following lines are set properly in the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/options/TestMicroDSTMake.py:

#importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py")                                                                                            
importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py")

This takes care of using the PFN (physical file name) this time instead of the LFN (logical file name), which we will use later working with the Grid.

Producing microDSTs on the Grid

First, follow the instructions on the page "Become a Grid user".

Configure Ganga

To set up Ganga, follow the instructions on the page "Setting up Ganga for use at Nikhef". Then start Ganga in a new shell:

SetupProject Ganga
ganga

Next, configure the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job/MicroDST_Ganga.py.

First set the correct DaVinci version in this line:

dv = DaVinci( version = 'vxxrxpx' )

Set the options file that writes your mDSTs:

dv.optsfile = 'optionfile.py'

Initialize the job:

j = Job( application = dv, name='name' )

Set the name of the output file:

j.outputdata = ['filename.dst']

Open question: does it matter if this name is different from the one in your options file?

Now choose the desired backend.

Backend Local

To run jobs locally:

j.backend    = Local()  

When all options are set correctly, submit the job by typing:

ganga MicroDST_Ganga.py

The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output files themselves go to /castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata.
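
As a rough sketch, the settings described in this section can be collected into one script text. The helper name render_ganga_script is hypothetical, the closing j.submit() line is an assumption about how the example script ends, and the keyword is written in lowercase as application, which is assumed to be the spelling Ganga expects. The generated text is only meaningful when run inside Ganga.

```python
def render_ganga_script(version, optsfile, job_name, output_file, backend="Local"):
    """Return the text of a Ganga job script built from the settings above."""
    if backend not in ("Local", "Dirac"):
        raise ValueError("backend must be 'Local' or 'Dirac'")
    lines = [
        "dv = DaVinci( version = %r )" % version,
        "dv.optsfile = %r" % optsfile,
        "j = Job( application = dv, name = %r )" % job_name,
        "j.outputdata = [%r]" % output_file,
        "j.backend = %s()" % backend,
        # Assumption: the example script ends by submitting the job.
        "j.submit()",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values taken from the examples on this page.
    print(render_ganga_script("v22r0p2", "optionfile.py", "name",
                              "filename.dst", backend="Dirac"))
```

Writing the returned text to a file and passing that file to ganga should behave like editing MicroDST_Ganga.py by hand, under the assumptions stated above.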

Backend Dirac

To run jobs on the Grid using Dirac:

j.backend    = Dirac()  

The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output data is now a list of LFNs. To get a local copy of the produced mDST (that is, in ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output), start Ganga and type

jobs(<ganga jobnumber>).backend.getOutputData()

If you want a copy on a Grid SE such as Castor, first type

jobs(<ganga jobnumber>).backend.getOutputDataLFNs()

to obtain the LFNs of the produced mDSTs. These can be used to make copies of the produced mDSTs on a local or Grid SE.

For a copy on a Grid SE, open a shell and type

SetupProject Dirac
lhcb-proxy-init
dirac-dms-replicate-lfn <LFN> CERN-USER

For storage on Castor use CERN-USER; for storage on SARA-TIER1 use NIKHEF-USER (this still needs to be verified).
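
When a job produces many LFNs, the replication commands can be generated rather than typed one by one. The following is a minimal sketch: the helper name replicate_commands and the example LFN are hypothetical, and the script only prints the commands, it does not run them.

```python
import shlex


def replicate_commands(lfns, storage_element="CERN-USER"):
    """Build one dirac-dms-replicate-lfn argument list per LFN."""
    return [["dirac-dms-replicate-lfn", lfn, storage_element] for lfn in lfns]


if __name__ == "__main__":
    # Hypothetical LFN, for illustration only.
    example_lfns = ["/lhcb/user/j/jdoe/1234/12345678/filename.dst"]
    for cmd in replicate_commands(example_lfns):
        # Print a shell-safe command line to paste into the Dirac shell.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The printed lines are meant to be run in a shell where SetupProject Dirac and lhcb-proxy-init have already been executed.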

Open question: this should be easier. Can Ganga simply be told where to store the output?



General comments


TODO: describe here how to set the input data using LFNs of DSTs, with Tristan's code snippet.


If you work on a 64-bit machine (which you do when logged in on lxplus) you get an error message saying that only slc4_ia32_gcc34 is allowed on the Grid. So you should type

setenv CMTCONFIG slc4_ia32_gcc34

For some general Ganga commands, see https://twiki.cern.ch/twiki/bin/view/LHCb/GangaTutorial1.


Note: the following is out of date, but it shows that ~/.gangarc is important.

If in your ~/.gangarc file you change the following line

outputsandbox_types = ['NTupleSvc', 'HistogramPersistencySvc', 'MicroDSTStream']

into this one

outputsandbox_types = []

the MicroDSTStream does not go to your outputsandbox, but to 'dataoutput' which is the Grid SE (Storage Element), so in case of backend.Dirac() it is

/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/<first four digits of jobID>/<jobID>

where <letter> stands for the first letter of your CERN user ID <UID>.

If you chose backend.Local(), 'dataoutput' is

/castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata

You find the jobID in the following way: In Ganga do

jobs(<gangajobnr>).backend.id

or, if you split the job into subjobs:

jobs("<gangajobnr.subjobnumber>").backend.id
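
The two 'dataoutput' locations above follow fixed patterns, so they can be computed from the user ID and the job number. This is a small sketch that only does string formatting; the function names, the user ID "jdoe" and the job numbers are hypothetical.

```python
def dirac_output_dir(uid, job_id):
    """Castor directory used for output data with backend.Dirac()."""
    job = str(job_id)
    # Pattern: .../<letter>/<UID>/<first four digits of jobID>/<jobID>
    return "/castor/cern.ch/grid/lhcb/user/%s/%s/%s/%s" % (uid[0], uid, job[:4], job)


def local_output_dir(uid, ganga_job_number):
    """Castor directory used for output data with backend.Local()."""
    return "/castor/cern.ch/user/%s/%s/%s/outputdata" % (uid[0], uid, ganga_job_number)


if __name__ == "__main__":
    # Hypothetical user ID and job numbers.
    print(dirac_output_dir("jdoe", 12345678))
    print(local_output_dir("jdoe", 42))
```

The Dirac job ID passed to dirac_output_dir is the value returned by backend.id in Ganga.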

Basic Castor commands

Here are some basic commands that you can use on Castor:

  • Copy a file from a local directory to CASTOR:
rfcp MyFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Copy a file from CASTOR to a local directory:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName MyFileName
  • Copy a file from one CASTOR location to another:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory1/MyFileName1 /castor/cern.ch/user/<letter>/<UID>/MyDirectory2/MyFileName2
  • Delete a file from CASTOR (to delete a directory use option -r):
rfrm /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Create a directory in CASTOR:
rfmkdir /castor/cern.ch/user/<letter>/<UID>/MyNewDirectory/
  • Move a file in CASTOR:
rfrename /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyOldFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyNewFileName
  • List the contents of a CASTOR directory:
rfdir /castor/cern.ch/user/<letter>/<UID>/MyDirectory

or

nsls -l /castor/cern.ch/user/<letter>/<UID>/MyDirectory
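
Because every command above repeats the same /castor/cern.ch/user/<letter>/<UID> prefix, a small helper can build the paths and the rfcp argument lists. This is a sketch under the assumption that the prefix is always derived from the user ID as shown; the function names and the user ID "jdoe" are hypothetical, and nothing is executed.

```python
CASTOR_USER_ROOT = "/castor/cern.ch/user"


def castor_path(uid, *parts):
    """Join the user's Castor home directory with further path components."""
    return "/".join([CASTOR_USER_ROOT, uid[0], uid] + list(parts))


def upload_command(uid, directory, filename):
    """Argument list for copying a local file to Castor with rfcp."""
    return ["rfcp", filename, castor_path(uid, directory, filename)]


def download_command(uid, directory, filename):
    """Argument list for copying a file from Castor to the current directory."""
    return ["rfcp", castor_path(uid, directory, filename), filename]


if __name__ == "__main__":
    # Hypothetical user ID; directory and file names as in the list above.
    print(" ".join(upload_command("jdoe", "MyDirectory", "MyFileName")))
    print(" ".join(download_command("jdoe", "MyDirectory", "MyFileName")))
```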

Copying microDSTs from Grid to Sara Tier1

First we want to make a directory at the Sara Tier1 where we can store our files.

Find Gerhard's script

then your grid script

then rfmkdir

Now we can copy the files from Castor to the LHCb storage on the Sara Amsterdam Tier1. We can do this either by starting the Dirac environment (this still needs to be checked)

or by manually setting up the environment that enables the lcg-cp command:

lhcb-proxy-init
source /afs/cern.ch/project/gd/LCG-Share/current/etc/profile.d/grid_env.csh

Now you can use the lcg commands. See http://ppewww.physics.gla.ac.uk/~fergusjk/howtolcg.html for an explanation of some of them.

For copying use:

lcg-cp -v srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/MyFileName \
  srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb/<username>/MyFileName
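
For several files, the lcg-cp command can be assembled from the two SRM prefixes used above. This is a minimal sketch that only builds the argument list; the function name, the user ID "jdoe" and the assumption that the same file name is used on both sides are illustrative, not part of any tool.

```python
CERN_SRM = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user"
NIKHEF_SRM = "srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb"


def lcg_cp_command(uid, username, filename):
    """Argument list for copying one file from Castor to the Nikhef storage."""
    source = "%s/%s/%s/%s" % (CERN_SRM, uid[0], uid, filename)
    destination = "%s/%s/%s" % (NIKHEF_SRM, username, filename)
    return ["lcg-cp", "-v", source, destination]


if __name__ == "__main__":
    # Hypothetical CERN user ID and Nikhef user name.
    print(" ".join(lcg_cp_command("jdoe", "jdoe", "MyFileName")))
```

The printed command is meant for a shell in which lhcb-proxy-init and the grid_env.csh setup above have already been run.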

The directory

Staging the Sara microDSTs on Stoomboot

Reading microDSTs

microDSTReadingExample.py

or make something new in Bender

Analysis

P2VV

getpack PhysFit/P2VV
getpack PhysFit/P2VVPython

and use the RooFit based fit package