Difference between revisions of "Produce and read microDSTs using the Grid"

From LHCb Wiki
 
(25 intermediate revisions by one other user not shown)

Latest revision as of 21:06, 27 March 2010

Producing microDSTs standalone

Getting the relevant packages and setting the environment

SetupProject DaVinci v22r0p2 --build-env
SetupProject DaVinci v22r0p2

getpack Phys/DaVinci v22r0p2
getpack PhysSel/Ccbar
getpack Ex/MicroDSTExample

cd ${User_release_area}/DaVinci_v22r0p2/Phys/DaVinci/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/cmt
gmake

cd ${User_release_area}/DaVinci_v22r0p2/PhysSel/Ccbar/cmt
gmake

Making the microDST

Do

cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job
gaudirun.py ../options/TestMicroDSTMake.py
ls -ltr

You should now see a newly created .dst file in this directory. For this standalone production of microDSTs, make sure that the following lines are set properly in the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/options/TestMicroDSTMake.py:

#importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py")                                                                                            
importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py")

This selects the PFN (physical file name) instead of the LFN (logical file name); the LFN is used later when working with the Grid.
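
The choice between the two option files can be sketched in Python. This is only an illustration: the helper name data_options and its on_grid flag are hypothetical and not part of MicroDSTExample; only the two option-file paths are taken from TestMicroDSTMake.py.

```python
# Hypothetical helper: pick the data options file for TestMicroDSTMake.py.
LFN_OPTIONS = "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py"  # logical file names (Grid)
PFN_OPTIONS = "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py"  # physical file names (standalone)


def data_options(on_grid):
    """Return the options file that would be passed to importOptions()."""
    return LFN_OPTIONS if on_grid else PFN_OPTIONS


if __name__ == "__main__":
    # Standalone production reads the data through PFNs.
    print('importOptions( "%s")' % data_options(on_grid=False))
```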

Producing microDSTs on the Grid

First, follow the instructions on how to Become a Grid user.

Configure Ganga

To set up Ganga, follow the instructions on Setting up Ganga for use at Nikhef. Then start Ganga in a new shell:

SetupProject Ganga
ganga

Now we still have to configure the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job/MicroDST_Ganga.py

First set the correct DaVinci version in this line:

dv = DaVinci( version = 'vxxrxpx' )

Set the options file that writes your mDSTs:

dv.optsfile = 'optionfile.py'

Initialize the job:

j = Job( application = dv, name='name' )

Set the name of the output file:

j.outputdata = ['filename.dst']

Open question: does it matter if this name is different from the one in your options file?

Now choose the desired backend.

Backend Local

To run jobs locally:

j.backend    = Local()  

When all options are set correctly, submit the jobs by typing:

ganga MicroDST_Ganga.py

The stderr and stdout are sent to the local ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output files themselves go to /castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata.

Backend Dirac

To run jobs on the Grid using Dirac:

j.backend    = Dirac()  

The stderr and stdout are sent to the local ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output is now a list of LFNs. To get a local copy of the produced mDST (that is, in ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output), start Ganga and type

jobs(<ganga jobnumber>).backend.getOutputData()

If you want a copy on a Grid SE such as Castor, type

jobs(<ganga jobnumber>).backend.getOutputDataLFNs()

to obtain the LFNs of the produced mDSTs. These can be used to make copies of the produced mDSTs on a local or Grid SE.

For a copy on a Grid SE, open a shell and type

SetupProject Dirac
lhcb-proxy-init
dirac-dms-replicate-lfn <LFN> CERN-USER

For storage on Castor use CERN-USER; for storage on SARA-TIER1 use NIKHEF-USER (this still needs to be checked).
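
When a job produces many LFNs, the replication command lines can be generated rather than typed. The following is a minimal sketch: replicate_commands is a hypothetical helper, the example LFN is made up, and the commands are only built as strings, not executed.

```python
import shlex


def replicate_commands(lfns, storage_element="CERN-USER"):
    """Build one dirac-dms-replicate-lfn command line per LFN (not executed)."""
    return [
        "dirac-dms-replicate-lfn %s %s"
        % (shlex.quote(lfn), shlex.quote(storage_element))
        for lfn in lfns
    ]


if __name__ == "__main__":
    # Made-up LFN, for illustration only.
    for command in replicate_commands(["/lhcb/user/j/jdoe/filename.dst"]):
        print(command)
```

The printed lines can be pasted into a shell in which SetupProject Dirac and lhcb-proxy-init have been run.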

Open question: this should be easier. Is it possible to tell Ganga directly where the output should be stored?



General comments


TODO: describe here how to set the input data with LFNs of DSTs, using Tristan's code snippet.


If you work on a 64-bit machine (which is the case when logged in on lxplus), you get an error message saying that only slc4_ia32_gcc34 is allowed on the Grid. In that case, type

setenv CMTCONFIG slc4_ia32_gcc34

For some general Ganga commands, see the Ganga tutorial at https://twiki.cern.ch/twiki/bin/view/LHCb/GangaTutorial1.


Note: the following is out of date, but it shows that ~/.gangarc is important.

If in your ~/.gangarc file you change the following line

outputsandbox_types = ['NTupleSvc', 'HistogramPersistencySvc', 'MicroDSTStream']

into this one

outputsandbox_types = []

the MicroDSTStream does not go to your outputsandbox but to 'dataoutput', which is the Grid SE (Storage Element). In the case of backend.Dirac() it is

/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/<first four digits of jobID>/<jobID>

Here <letter> stands for the first letter of your CERN user ID <UID>.

If you chose backend.Local() instead, then 'dataoutput' is

/castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata
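
The two path patterns above can be written as small Python functions. This is a sketch of the patterns as given in the text, which is itself marked as out of date; the function names and the example user ID and job numbers are hypothetical.

```python
def dirac_output_dir(uid, job_id):
    """'dataoutput' directory for backend.Dirac(), following the pattern above."""
    job_id = str(job_id)
    # <letter>/<UID>/<first four digits of jobID>/<jobID>
    return "/castor/cern.ch/grid/lhcb/user/%s/%s/%s/%s" % (
        uid[0], uid, job_id[:4], job_id)


def local_output_dir(uid, ganga_jobnumber):
    """'dataoutput' directory for backend.Local(), following the pattern above."""
    return "/castor/cern.ch/user/%s/%s/%s/outputdata" % (
        uid[0], uid, ganga_jobnumber)


if __name__ == "__main__":
    # Made-up user ID and job numbers, for illustration only.
    print(dirac_output_dir("jdoe", 12345678))
    print(local_output_dir("jdoe", 42))
```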

You find the jobID in the following way: In Ganga do

jobs(<gangajobnr>).backend.id

or, if you split the job into subjobs:

jobs("<gangajobnr.subjobnumber>").backend.id
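
To collect the IDs of all subjobs in one go, a loop along the following lines could be used. It is a sketch that assumes a job object exposes its subjobs through an iterable subjobs attribute and each (sub)job has backend.id; backend_ids is a hypothetical helper, and the stand-in objects only serve to make the example runnable outside Ganga.

```python
from types import SimpleNamespace


def backend_ids(job):
    """Return the backend IDs of all subjobs, or of the job itself if unsplit."""
    subjobs = list(getattr(job, "subjobs", []))  # assumed attribute name
    if subjobs:
        return [subjob.backend.id for subjob in subjobs]
    return [job.backend.id]


if __name__ == "__main__":
    # Stand-ins for a split job with two subjobs (made-up IDs).
    split_job = SimpleNamespace(
        backend=SimpleNamespace(id=None),
        subjobs=[SimpleNamespace(backend=SimpleNamespace(id=101)),
                 SimpleNamespace(backend=SimpleNamespace(id=102))],
    )
    print(backend_ids(split_job))
```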

Basic Castor commands

Here are some basic commands that you can use on Castor:

  • Copy a file from a local directory to CASTOR:
rfcp MyFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Copy a file from CASTOR to a local directory:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName MyFileName
  • Copy a file from one CASTOR location to another:
rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory1/MyFileName1 /castor/cern.ch/user/<letter>/<UID>/MyDirectory2/MyFileName2
  • Delete a file from CASTOR (to delete a directory use option -r):
rfrm /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
  • Create a directory in CASTOR:
rfmkdir /castor/cern.ch/user/<letter>/<UID>/MyNewDirectory/
  • Move a file in CASTOR:
rfrename /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyOldFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyNewFileName
  • List the contents of a CASTOR directory:
rfdir /castor/cern.ch/user/<letter>/<UID>/MyDirectory

or

nsls -l /castor/cern.ch/user/<letter>/<UID>/MyDirectory
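
The Castor user paths used in these commands all follow the same pattern, so they can be built in Python before calling rfcp. The helpers castor_path and rfcp_upload below are hypothetical; the sketch only builds the argument list and does not run the command.

```python
import os


def castor_path(uid, *parts):
    """Build /castor/cern.ch/user/<letter>/<UID>/... from a CERN user ID."""
    return "/".join(["/castor/cern.ch/user", uid[0], uid] + list(parts))


def rfcp_upload(local_file, uid, directory):
    """Argument list for copying a local file to a Castor user directory."""
    target = castor_path(uid, directory, os.path.basename(local_file))
    return ["rfcp", local_file, target]


if __name__ == "__main__":
    # Made-up user ID, for illustration only.
    print(" ".join(rfcp_upload("MyFileName", "jdoe", "MyDirectory")))
```

On a machine with the Castor client installed, the returned list could be passed to subprocess.call to run the copy.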

Copying microDSTs from Grid to Sara Tier1

First we want to make a directory at the Sara Tier1 where we can store our files.

Steps still to be written up: find Gerhard's script, then your grid script, then use rfmkdir.

Now we can copy the files from Castor to the LHCb storage on the Sara Amsterdam Tier1. We can do this either by starting the Dirac environment (this still needs to be checked)

or by manually entering the following, which enables us to use the lcg-cp command:

lhcb-proxy-init
source /afs/cern.ch/project/gd/LCG-Share/current/etc/profile.d/grid_env.csh

Now you can use the lcg commands. See http://ppewww.physics.gla.ac.uk/~fergusjk/howtolcg.html for an explanation of some lcg commands.

For copying use:

lcg-cp -v srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/MyFileName \
    srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb/<username>/MyFileName
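
For several files, the source and destination SURLs can be assembled in Python. The sketch below takes the two SRM prefixes from the command above; the function name lcg_cp_command and the example user names are hypothetical, and the command is only built, not executed.

```python
CERN_SRM = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user"
NIKHEF_SRM = "srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb"


def lcg_cp_command(uid, username, filename):
    """Argument list for copying one file from Castor at CERN to Nikhef."""
    source = "%s/%s/%s/%s" % (CERN_SRM, uid[0], uid, filename)
    destination = "%s/%s/%s" % (NIKHEF_SRM, username, filename)
    return ["lcg-cp", "-v", source, destination]


if __name__ == "__main__":
    # Made-up user names, for illustration only.
    print(" ".join(lcg_cp_command("jdoe", "jdoe", "MyFileName")))
```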

The directory

Staging the Sara microDSTs on Stoomboot

Reading microDSTs

microDSTReadingExample.py

or make something new in Bender

Analysis

P2VV

getpack PhysFit/P2VV
getpack PhysFit/P2VVPython

and use the RooFit-based fit package.