Difference between revisions of "Produce and read microDSTs using the Grid"
Latest revision as of 19:06, 27 March 2010
Producing microDSTs standalone
Getting the relevant packages and setting the environment
SetupProject DaVinci v22r0p2 --build-env
SetupProject DaVinci v22r0p2
getpack Phys/DaVinci v22r0p2
getpack PhysSel/Ccbar
getpack Ex/MicroDSTExample
cd ${User_release_area}/DaVinci_v22r0p2/Phys/DaVinci_v22r0p2/Phys/DaVinci/cmt
gmake
cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/cmt
gmake
cd ${User_release_area}/DaVinci_v22r0p2/PhysSel/Ccbar/cmt
gmake
Making the microDST
Do
cd ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job
gaudirun.py ../options/TestMicroDSTMake.py
ls -ltr
You should now see a .dst file in this directory which has just been created. For this standalone production of microDSTs take care that the following lines are set properly in the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/options/TestMicroDSTMake.py:
#importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataLFN.py")
importOptions( "$MICRODSTEXAMPLEROOT/options/JpsiPhiDataPFN.py")
This makes the job read its input via the PFN (physical file name) instead of the LFN (logical file name), which we will use later when working with the Grid.
Producing microDSTs on the Grid
First, follow the instructions on how to Become a Grid user.
Configure Ganga
To set up Ganga, follow the instructions on Setting up Ganga for use at Nikhef. Then start Ganga in a new shell:
SetupProject Ganga
ganga
Now we still have to configure the file ${User_release_area}/DaVinci_v22r0p2/Ex/MicroDSTExample/job/MicroDST_Ganga.py
First set the correct DaVinci version in this line:
dv = DaVinci( version = 'vxxrxpx' )
In Ganga, set the options file that writes your mDSTs:
dv.optsfile = 'optionfile.py'
Initialize the job:
j = Job( application = dv, name='name' )
Set the name of the output file:
j.outputdata = ['filename.dst']
Open question: does it matter if this name is different from the one in your options file?
Now choose the desired backend.
Backend Local
To run jobs locally:
j.backend = Local()
When all options are set correctly, submit the job by typing:
ganga MicroDST_Ganga.py
The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output files themselves go to /castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata.
Backend Dirac
To run jobs on the Grid using Dirac:
j.backend = Dirac()
The stderr and stdout are sent to the local directory ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output, while the output is now a list of LFNs. To get a local copy of the produced mDST (that is, in ~/gangadir/workspace/<user>/LocalAMGA/<ganga jobnumber>/output), start Ganga and type
jobs(<ganga jobnumber>).backend.getOutputData()
If you want a copy on a Grid SE such as Castor, type
jobs(<ganga jobnumber>).backend.getOutputDataLFNs()
to obtain the LFNs of the produced mDSTs. These can be used to make copies of the produced mDSTs on a local or Grid SE.
For a copy on a Grid SE, open a shell and type
SetupProject Dirac
lhcb-proxy-init
dirac-dms-replicate-lfn <LFN> CERN-USER
For storage on Castor, use CERN-USER; for storage on SARA-TIER1, use NIKHEF-USER (open question: is this correct?).
Open question: this should be easier. Is it possible to tell Ganga directly where the output should be stored?
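When a job produces several mDSTs, the replication command above has to be repeated for each LFN. The following is a minimal sketch of how the command lines could be generated in Python. The helper name replicate_commands and the example LFN are hypothetical; only the dirac-dms-replicate-lfn command and the CERN-USER storage element name come from the text above. The sketch only builds the argument lists; actually running them would still require SetupProject Dirac and lhcb-proxy-init first.

```python
def replicate_commands(lfns, storage_element="CERN-USER"):
    """Build one 'dirac-dms-replicate-lfn <LFN> <SE>' argument list per LFN.

    The commands are only constructed here, not executed.
    """
    return [["dirac-dms-replicate-lfn", lfn, storage_element] for lfn in lfns]


if __name__ == "__main__":
    # Hypothetical LFN, for illustration only; use the list returned by
    # getOutputDataLFNs() in a real session.
    example_lfns = ["/lhcb/user/j/jdoe/1234/1234567/filename.dst"]
    for command in replicate_commands(example_lfns):
        print(" ".join(command))
```

Each returned list could be passed to subprocess.call from a shell in which the Dirac environment and a valid proxy are already set up.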
General comments
TODO: cover here how to set the inputdata with LFNs of DSTs, using Tristan's code snippet.
If you work on a 64-bit machine (which you do when logged in on lxplus), you get an error message saying that only slc4_ia32_gcc34 is allowed on the Grid. In that case, type
setenv CMTCONFIG slc4_ia32_gcc34
For some general Ganga commands, see https://twiki.cern.ch/twiki/bin/view/LHCb/GangaTutorial1.
Note: the following is out of date, but it shows that ~/.gangarc is important.
If in your ~/.gangarc file you change the following line
outputsandbox_types = ['NTupleSvc', 'HistogramPersistencySvc', 'MicroDSTStream']
into this one
outputsandbox_types = []
the MicroDSTStream does not go to your outputsandbox, but to 'dataoutput' which is the Grid SE (Storage Element), so in case of backend.Dirac() it is
/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/<first four digits of jobID>/<jobID>
where <letter> stands for the first letter of your CERN user ID <UID>.
If you chose backend.Local() instead, then 'dataoutput' is
/castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata
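The two path patterns above can be filled in mechanically. Below is a minimal sketch in Python; the function names dirac_output_dir and local_output_dir and the example user ID are hypothetical, and the patterns are copied from the text, which itself notes that this behaviour may be out of date. It assumes the Dirac jobID has at least four digits.

```python
def dirac_output_dir(uid, job_id):
    """Output directory for backend.Dirac(), following the pattern
    /castor/cern.ch/grid/lhcb/user/<letter>/<UID>/<first four digits of jobID>/<jobID>.
    """
    job_id = str(job_id)
    return "/castor/cern.ch/grid/lhcb/user/{0}/{1}/{2}/{3}".format(
        uid[0], uid, job_id[:4], job_id
    )


def local_output_dir(uid, ganga_job_number):
    """Output directory for backend.Local(), following the pattern
    /castor/cern.ch/user/<letter>/<UID>/<ganga jobnumber>/outputdata.
    """
    return "/castor/cern.ch/user/{0}/{1}/{2}/outputdata".format(
        uid[0], uid, ganga_job_number
    )


if __name__ == "__main__":
    # 'jdoe' and the job numbers are made-up values for illustration.
    print(dirac_output_dir("jdoe", 1234567))
    print(local_output_dir("jdoe", 42))
```

Note that the Dirac pattern uses the Dirac jobID, while the Local pattern uses the Ganga job number; the two are different identifiers.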
You find the jobID in the following way: In Ganga do
jobs(<gangajobnr>).backend.id
or, in case you split the job into subjobs:
jobs("<gangajobnr.subjobnumber>").backend.id
Basic Castor commands
Here are some basic commands that you can use on Castor:
- Copy a file from a local directory to CASTOR:
- rfcp MyFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
- Copy a file from CASTOR to a local directory:
- rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName MyFileName
- Copy a file from one CASTOR location to another:
- rfcp /castor/cern.ch/user/<letter>/<UID>/MyDirectory1/MyFileName1 /castor/cern.ch/user/<letter>/<UID>/MyDirectory2/MyFileName2
- Delete a file from CASTOR (to delete a directory use option -r):
- rfrm /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyFileName
- Create a directory in CASTOR:
- rfmkdir /castor/cern.ch/user/<letter>/<UID>/MyNewDirectory/
- Move a file in CASTOR:
- rfrename /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyOldFileName /castor/cern.ch/user/<letter>/<UID>/MyDirectory/MyNewFileName
- List the contents of a CASTOR directory:
- rfdir /castor/cern.ch/user/<letter>/<UID>/MyDirectory
or
- nsls -l /castor/cern.ch/user/<letter>/<UID>/MyDirectory
Copying microDSTs from Grid to Sara Tier1
First we want to make a directory at the Sara Tier1 where we can store our files.
Find Gerhard's script
then your grid script
then rfmkdir
Now we can copy the files from Castor to the LHCb storage on the Sara Amsterdam Tier1. We can do this either by starting the Dirac environment (TODO: check this),
or by manually entering the following commands, which enable us to use the lcg-cp command:
lhcb-proxy-init
source /afs/cern.ch/project/gd/LCG-Share/current/etc/profile.d/grid_env.csh
Now you can use the lcg-commands. Look here for an explanation of some lcg commands.
For copying use:
lcg-cp -v srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/<letter>/<UID>/MyFileName srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb/<username>/MyFileName
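To copy many files, the lcg-cp command line above can be generated per file. The following is a minimal sketch in Python; the function name lcg_cp_command and the example values are hypothetical, while the two SRM endpoints and the -v flag are taken verbatim from the command above. The sketch only builds the argument list and does not run lcg-cp.

```python
def lcg_cp_command(uid, username, filename):
    """Build the lcg-cp argument list that copies one file from the LHCb
    user area on Castor to the Nikhef storage, using the paths shown above.
    """
    source = "srm://srm-lhcb.cern.ch/castor/cern.ch/grid/lhcb/user/{0}/{1}/{2}".format(
        uid[0], uid, filename
    )
    destination = "srm://tbn18.nikhef.nl/dpm/nikhef.nl/home/lhcb/{0}/{1}".format(
        username, filename
    )
    return ["lcg-cp", "-v", source, destination]


if __name__ == "__main__":
    # 'jdoe' and 'MyFileName' are placeholder values for illustration.
    print(" ".join(lcg_cp_command("jdoe", "jdoe", "MyFileName")))
```

The printed line can be pasted into a shell in which lhcb-proxy-init and the grid environment script have already been run.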
The directory
Staging the Sara microDSTs on Stoomboot
Reading microDSTs
microDSTReadingExample.py
or make something new in Bender
Analysis
P2VV
getpack PhysFit/P2VV
getpack PhysFit/P2VVPython
and use the RooFit-based fit package