Produce and read microDSTs using the Grid

From LHCb Wiki

Latest revision as of 19:06, 27 March 2010

Producing microDSTs standalone

Getting the relevant packages and setting the environment

SetupProject DaVinci v22r0p2 --build-env
SetupProject DaVinci v22r0p2

getpack Phys/DaVinci v22r0p2
getpack PhysSel/Ccbar
getpack Ex/MicroDSTExample

cd ${User_release_area}/DaVinci_v22r0p2/Phys/DaVinci_v22r0p2/Phys/DaVinci/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/PhysSel/Ccbar/cmt
gmake

Making the microDST

Run the example options from the job directory:

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job
gaudirun.py ../options/TestMicroDSTMake.py
ls -ltr

A newly created .dst file should now be visible in this directory. For this standalone production of microDSTs, make sure the following lines are set as shown in the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/options/TestMicroDSTMake.py:

#importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py")                                                                                            
importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py")

This selects the PFN (physical file name) version of the input data instead of the LFN (logical file name) version, which we will use later when working with the Grid.
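
To check which data options file is actually active before running, a small plain-Python helper can scan the options file. This is only a sketch and not part of MicroDSTExample: the function name active_import_options is hypothetical, and it assumes each importOptions call sits on a single line.

```python
import re

# Matches an importOptions( "..." ) call and captures the quoted argument.
_IMPORT_RE = re.compile(r'importOptions\(\s*["\']([^"\']+)["\']\s*\)')


def active_import_options(text):
    """Return the arguments of importOptions calls that are not commented out."""
    active = []
    for line in text.splitlines():
        code = line.split("#", 1)[0]  # ignore everything after a '#' comment marker
        active.extend(_IMPORT_RE.findall(code))
    return active


example = '''
#importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py")
importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py")
'''
print(active_import_options(example))
```

For the standalone case only the PFN file should be listed.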

Producing microDSTs on the Grid

First, follow the instructions on how to Become a Grid user.

Configure Ganga

To set up Ganga, follow the instructions on Setting up Ganga for use at Nikhef. Then start Ganga in a new shell:

SetupProject Ganga
ganga

Now we still have to configure the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job/MicroDST_Ganga.py

First set the correct DaVinci version in this line:

dv = DaVinci( version = 'vxxrxpx' )

In Ganga, set the options file that writes your mDSTs:

dv.optsfile = 'optionfile.py'

Initialize the job:

j = Job( application = dv, name = 'name' )

Set the name of the output file:

j.outputdata = ['filename.dst']

Open question: does it matter if this name is different from the one in your options file?

Now choose the desired backend.

Backend Local

To run jobs locally:

j.backend = Local()

When all options are set correctly, submit the job by typing:

ganga MicroDST_Ganga.py

The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output files themselves go to /castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata.
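
The settings of this section can be collected into one script. The sketch below does not need Ganga: it only renders the text of such a script, so the result can be compared with your own MicroDST_Ganga.py. The function name render_ganga_script and the job name are hypothetical. Two assumptions are made: the Job attribute is spelled application in lower case, and the script ends with j.submit(). Check both against the MicroDST_Ganga.py shipped with the package.

```python
def render_ganga_script(version, optsfile, job_name, output_file, backend="Local"):
    """Return the text of a Ganga job script built from the settings in this section."""
    if backend not in ("Local", "Dirac"):
        raise ValueError("backend must be 'Local' or 'Dirac'")
    lines = [
        "dv = DaVinci( version = %r )" % version,
        "dv.optsfile = %r" % optsfile,
        "j = Job( application = dv, name = %r )" % job_name,
        "j.outputdata = [%r]" % output_file,
        "j.backend = %s()" % backend,
        "j.submit()  # assumption: the script submits the job itself",
    ]
    return "\n".join(lines) + "\n"


# 'microdst' and 'filename.dst' are placeholder names for illustration.
print(render_ganga_script("v22r0p2", "TestMicroDSTMake.py", "microdst", "filename.dst"))
```

Passing backend="Dirac" gives the Grid version of the same script.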

Backend Dirac

To run jobs on the Grid using Dirac:

j.backend = Dirac()

The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output is now a list of LFNs. To get a local copy of the produced mDST (that is, in ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output), start Ganga and type

jobs(<ganga jobnumber>).backend.getOutputData()

If you want a copy on a Grid SE such as Castor, type

jobs(<ganga jobnumber>).backend.getOutputDataLFNs()

to obtain the LFNs of the produced mDSTs. These can be used to make copies of the produced mDSTs on a local or Grid SE.

For a copy on a Grid SE, open a shell and type

SetupProject Dirac
lhcb-proxy-init
dirac-dms-replicate-lfn <LFN> CERN-USER

For storage on Castor use CERN-USER; for storage on SARA-TIER1 use NIKHEF-USER (this still needs to be confirmed).
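
When a job produces several LFNs, the replication command has to be repeated for each one. A minimal sketch, assuming the command form shown above: the helper name replicate_commands and the example LFN are hypothetical, and nothing is executed here.

```python
def replicate_commands(lfns, storage_element="CERN-USER"):
    """Build one dirac-dms-replicate-lfn argument list per LFN (nothing is executed)."""
    return [["dirac-dms-replicate-lfn", lfn, storage_element] for lfn in lfns]


# Hypothetical LFN, for illustration only.
for cmd in replicate_commands(["/lhcb/user/j/jdoe/1234/1234567/filename.dst"]):
    print(" ".join(cmd))
```

Each argument list can be passed to subprocess.call in a shell where SetupProject Dirac and lhcb-proxy-init have already been run.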

Open question: this should be easier. Can Ganga simply be told where to store the output?



General comments


TODO: describe here how to set the input data using LFNs of DSTs, using Tristan's code snippet.


If you work on a 64-bit machine (which you do when logged in on lxplus), you get an error message saying that only slc4_ia32_gcc34 is allowed on the Grid. In that case, type

setenv CMTCONFIG slc4_ia32_gcc34
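
The same condition can be checked from Python before submitting, for example at the top of a job script. This is a sketch only; the helper name cmtconfig_ok is hypothetical and the required value is the one quoted above.

```python
import os

# Platform string quoted in the error message described above.
REQUIRED_CMTCONFIG = "slc4_ia32_gcc34"


def cmtconfig_ok(environ=None):
    """Return True if CMTCONFIG is set to the platform accepted on the Grid."""
    environ = os.environ if environ is None else environ
    return environ.get("CMTCONFIG") == REQUIRED_CMTCONFIG


if not cmtconfig_ok():
    print("CMTCONFIG is not %s; set it before submitting" % REQUIRED_CMTCONFIG)
```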

For some general Ganga commands, see the Ganga tutorial at https://twiki.cern.ch/twiki/bin/view/LHCb/GangaTutorial1.


Note: the following is out of date, but it shows that ~/.gangarc is important.

If in your ~/.gangarc file you change the following line

outputsandbox_types = ['NTupleSvc', 'HistogramPersistencySvc', 'MicroDSTStream']

into this one

outputsandbox_types = []

the MicroDSTStream does not go to your outputsandbox but to 'dataoutput', which is on the Grid SE (Storage Element). In the case of backend.Dirac() this is

/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/<first four digits of jobID>/<jobID>

where <letter> stands for the first letter of your CERN user ID <UID>.

While if you chose backend.Local() then 'dataoutput' is

/castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata
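
The two 'dataoutput' locations can be built from the user ID and the job numbers. A minimal sketch that follows the path patterns above: the helper names and the example values are hypothetical, and the patterns themselves come from the note that is marked as out of date.

```python
def dirac_dataoutput(uid, job_id):
    """Castor directory for backend.Dirac(), following the pattern quoted above."""
    job_id = str(job_id)
    # <letter> is the first letter of the user ID; the jobID prefix is its first four digits.
    return "/castor/cern.ch/grid/lhcb/user/%s/%s/%s/%s" % (uid[0], uid, job_id[:4], job_id)


def local_dataoutput(uid, ganga_job_number):
    """Castor directory for backend.Local(), following the pattern quoted above."""
    return "/castor/cern.ch/user/%s/%s/%s/outputdata" % (uid[0], uid, ganga_job_number)


# Hypothetical user ID and job numbers, for illustration only.
print(dirac_dataoutput("jdoe", 1234567))
print(local_dataoutput("jdoe", 42))
```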

You find the jobID in the following way: In Ganga do

jobs(<gangajobnr>).backend.id

or, if you split the job into subjobs:

jobs("<gangajobnr.subjobnumber>").backend.id

Basic Castor commands

Here are some basic commands that you can use on Castor:

  • Copy a file from a local directory to CASTOR:
rfcp MyFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Copy a file from CASTOR to a local directory:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName MyFileName
  • Copy a file from one CASTOR location to another:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory1/MyFileName1 /castor/cern.ch/user/<letter>/<UID>/MyDirectory2/MyFileName2
  • Delete a file from CASTOR (to delete a directory use option -r):
rfrm /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Create a directory in CASTOR:
rfmkdir /castor/cern.ch/user/<letter>/<UID>/MyNewDirectory/
  • Move a file in CASTOR:
rfrename /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyOldFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyNewFileName
  • List the contents of a CASTOR directory:
rfdir /castor/cern.ch/user/<letter>/<UID>/MyDirectory

or

nsls -l /castor/cern.ch/user/<letter>/<UID>/MyDirectory
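
When many files have to be copied, the Castor paths and rfcp commands can be generated instead of typed. The sketch below only builds the argument lists and does not call rfcp. The helper names castor_user_path and rfcp_upload are hypothetical, and the user ID is a placeholder.

```python
import posixpath


def castor_user_path(uid, *parts):
    """Return /castor/cern.ch/user/<letter>/<UID>/... for the given user ID."""
    return posixpath.join("/castor/cern.ch/user", uid[0], uid, *parts)


def rfcp_upload(local_file, uid, directory):
    """Argument list that copies a local file into a Castor user directory."""
    target = castor_user_path(uid, directory, posixpath.basename(local_file))
    return ["rfcp", local_file, target]


# Hypothetical user ID, for illustration only.
print(" ".join(rfcp_upload("MyFileName", "jdoe", "MyDirectory")))
```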

Copying microDSTs from Grid to Sara Tier1

First we want to make a directory at the Sara Tier1 where we can store our files.

Find Gerhard's script

then your grid script

then rfmkdir

Now we can copy the files from Castor to the LHCb storage on the Sara Amsterdam Tier1. We can do this either by starting the Dirac environment (this still needs to be checked), or by manually setting up the environment that enables the lcg-cp command:

lhcb-proxy-init
source /afs/cern.ch/project/gd/LCG-Share/current/etc/profile.d/grid_env.csh

Now you can use the lcg commands. For an explanation of some of them, see http://ppewww.physics.gla.ac.uk/~fergusjk/howtolcg.html.

For copying use:

lcg-cp -v srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/MyFileName \
    srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb/<username>/MyFileName
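
For more than a handful of files, the source and destination SURLs can be generated. A minimal sketch that reuses the two SRM endpoints from the command above: the helper name lcg_cp_command and the example names are hypothetical, and the command is only built, not run.

```python
SOURCE_PREFIX = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user"
DEST_PREFIX = "srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb"


def lcg_cp_command(uid, username, file_name):
    """Argument list for copying one file from Castor to the Nikhef storage."""
    source = "%s/%s/%s/%s" % (SOURCE_PREFIX, uid[0], uid, file_name)
    dest = "%s/%s/%s" % (DEST_PREFIX, username, file_name)
    return ["lcg-cp", "-v", source, dest]


# Hypothetical user names and file name, for illustration only.
print(" ".join(lcg_cp_command("jdoe", "jdoe", "MyFileName")))
```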

The directory

Staging the Sara microDSTs on Stoomboot

Reading microDSTs

microDSTReadingExample.py

or write something new in Bender

Analysis

P2VV

getpack PhysFit/P2VV
getpack PhysFit/P2VVPython

and use the RooFit based fit package