AnalysisAtNikhef

From ALICE Wiki
Revision as of 10:00, 13 January 2022 by Hohbern@nikhef.nl (talk | contribs) (additional information regarding the O2Physics installation for the Run3Analysisvalidation)

Running analyses at Nikhef

General information

This is an index of Technical/computing info for the ALICE group

If you want to launch your analysis on the Nikhef batch farm and you are connecting from outside the Nikhef domain, first connect to the login server:

ssh <username>@login.nikhef.nl

Please note that you should never run anything on this machine (not even a browser), since it is the entry point to the Nikhef domain for everybody. For light tasks (e.g. browsing a web page, downloading a paper) use ssh again to connect to one of the desktops of our group:

mole, hamme, blavet, inn, jaba, todd, nass

If you need to run an analysis interactively that accesses data stored on the common disk, connect to one of the interactive nodes:

stbc-i1, stbc-i2, stbc-i4
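
The two-step login described above can probably be collapsed into one command with the -J (ProxyJump) option of OpenSSH. The sketch below only builds and prints the command. It assumes OpenSSH 7.3 or newer on your own machine and that the nodes are reachable as <node>.nikhef.nl, which is not stated on this page; the user name jdoe is hypothetical.

```shell
# Print (not run) an ssh command that hops through the login server.
# Assumption: hosts resolve as <node>.nikhef.nl; adjust if they do not.
nikhef_ssh_cmd() {
    local user="$1" node="$2"
    echo "ssh -J ${user}@login.nikhef.nl ${user}@${node}.nikhef.nl"
}

# Example: command to reach the interactive node stbc-i1 as the hypothetical user "jdoe"
nikhef_ssh_cmd jdoe stbc-i1
```

Nothing runs on the login server itself in this setup; it only forwards the connection.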

Storage information

  • Each user is allocated a limited space under the home directory. Please use this space only for lightweight files, e.g. macros, pdf files,...
  • As a group we are given 300GB under /project/alice, which is used to install our common software. Please do not use this space to store large files, e.g. root files. This directory can be accessed from all machines, i.e. the desktops (see the names above) as well as the interactive and batch nodes of Nikhef.
  • The current common storage of Nikhef is based on dcache (for more information about the system, check this link), an efficient system to store and retrieve large files. It can be accessed only from the interactive and batch nodes of Nikhef. As a group we have ~300TB of disk space under /dcache/alice. This is currently the place where you can store your files for analysis, production etc. If you want access to it, please send a mail to Panos Christakoglou. Note that this storage is not visible to AliEn; it is reserved for local use. It can, however, see the GRID file catalogue, which allows copying productions (the typical use case of this storage for the group).

ALICE software and how to access it

The ALICE software is no longer installed locally at Nikhef. If you still require a version to be installed (e.g. for debugging or development), please send a mail to Panos Christakoglou, indicating the tag you need.

The ALICE software can now be accessed via cvmfs, which is centrally installed on all machines at Nikhef. To get it working, first add the following line to your .bashrc (or the startup file of whichever shell you use):

source /cvmfs/alice.cern.ch/etc/login.sh

(NB: this assumes that you are using the bash shell; you can check with 'echo $SHELL' and if needed, change using this link: https://sso.nikhef.nl/chsh/ )

To list the available modules type:

alienv q | grep AliPhysics

(NB: if the alienv command is not recognised, try instead

/cvmfs/alice.cern.ch/bin/alienv q | grep AliPhysics

here and in the following lines.)
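
The fallback above can be written once so that the later commands do not need to be adapted by hand. This is a minimal sketch; the variable name ALIENV is an assumption introduced here for illustration.

```shell
# Use alienv from PATH if available, otherwise fall back to the full cvmfs path.
if command -v alienv >/dev/null 2>&1; then
    ALIENV="alienv"
else
    ALIENV="/cvmfs/alice.cern.ch/bin/alienv"
fi
echo "Using: $ALIENV"

# Afterwards, e.g.:  "$ALIENV" q | grep AliPhysics
```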

To load the ALICE environment on your Nikhef node, use:

alienv enter VO_ALICE@AliPhysics::vAN-20211101-1

Replace the tag (here vAN-20211101-1) with the one you want to use.
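
If you simply want the most recent daily tag, the module list can be filtered and sorted. The helper below is a sketch with a hypothetical name; it assumes that daily tags have the form VO_ALICE@AliPhysics::vAN-YYYYMMDD-N, as in the examples on this page, so that a plain text sort puts the newest date last.

```shell
# Read module names on stdin and print the newest daily AliPhysics tag.
# Assumption: daily tags look like VO_ALICE@AliPhysics::vAN-YYYYMMDD-N,
# so a plain text sort puts the most recent date last.
latest_aliphysics_tag() {
    grep 'AliPhysics::vAN-' | sort | tail -n 1
}

# On a Nikhef node:  alienv q | latest_aliphysics_tag
# Self-contained demonstration with the two tags mentioned on this page:
printf '%s\n' \
    'VO_ALICE@AliPhysics::vAN-20211101-1' \
    'VO_ALICE@AliPhysics::vAN-20161005-1' | latest_aliphysics_tag
```

The newest tag is not necessarily the one your analysis was validated with, so check with your supervisor before switching.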

Running analyses on the Stoomboot cluster

We have two local computer clusters:

A new storage based on dcache is deployed at Nikhef. Overall, close to 300TB of disk space is available for the group. This storage is intended for local batch analysis (i.e. using stoomboot) of data samples that are moved to dcache. Currently the LHC10h (AOD160) and the low intensity runs of the LHC15o Pb-Pb periods are stored under /dcache/alice/panosch/alice/data/. Below you can find a template of a script, called submit.sh (it can certainly be written better), that allows you to launch a series of batch jobs and analyze either the 2010 or the 2015 data. To run it, execute the following from one of the Nikhef desktops:

source submit.sh lhc10h.txt 2010 

where the text file contains all the 2010 run numbers copied to Nikhef (one per line) and 2010 indicates the production year.

#!/bin/bash
# Usage: source submit.sh <run-list file> <production year: 2010 or 2015>
SCRIPT="runAnalysis.sh"

# Read one run number per line; the extra test keeps a last line without a trailing newline
while IFS= read -r runNumber || [[ -n "$runNumber" ]]; do
    echo "Adding run number from file: $runNumber"

    # Pick the data directory for the requested production year
    if [ "$2" == "2010" ]; then
        dataDir="/dcache/alice/panosch/alice/data/2010/LHC10h/AOD160/$runNumber"
    elif [ "$2" == "2015" ]; then
        dataDir="/dcache/alice/panosch/alice/data/2015/LHC15o/000$runNumber/pass2_lowIR"
    else
        echo "Unknown production year: $2"
        break
    fi

    # Write the job script; \$ defers the expansion until the job runs on the batch node
    (echo '#!/bin/bash'
     echo "source /cvmfs/alice.cern.ch/etc/login.sh"
     echo "eval \$(alienv printenv VO_ALICE@AliPhysics::vAN-20161005-1)"
     echo "which aliroot || exit 1"
     echo "cd $dataDir || exit 1"
     echo "pwd"
     echo "rm -f AnalysisResults.root"
     echo "if [ ! -f runFlowPIDSPTask.C ]; then"
     echo "    ln -s /user/panosch/ALICE/Flow/HigherHarmonics/Stoomboot/runFlowPIDSPTask.C ."
     echo "fi"
     echo "exec aliroot -b -q runFlowPIDSPTask.C"
    ) > "$SCRIPT"

    # Submit the job to the stbcq queue
    qsub -q stbcq "$SCRIPT"
done < "$1"
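
Since every line of the run list results in one batch job, it may be worth checking the file before submitting. The function below is a sketch with a hypothetical name; it assumes that run numbers consist of digits only and reports empty lines as errors, because the submit script would try to submit a job for them as well.

```shell
# Return 0 only if every line of the file is a plain run number (digits only).
check_run_list() {
    local file="$1" line status=0
    [ -r "$file" ] || { echo "Cannot read $file" >&2; return 1; }
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            ''|*[!0-9]*) echo "Not a run number: '$line'" >&2; status=1 ;;
        esac
    done < "$file"
    return "$status"
}

# Usage:  check_run_list lhc10h.txt && source submit.sh lhc10h.txt 2010
```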

Using singularity to run Jetscape at Nikhef

On the Nikhef stoomboot cluster and login nodes, singularity is available to run code in 'containers' (a kind of system inside a system), which is useful, for example, for running the Jetscape generator.

The singularity executable is available in this directory:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity 

You can use the full path name, or put an alias in your bashrc file.
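
As a variation on the alias, a shell function does the same job and also works inside scripts, where aliases are normally not expanded. A minimal sketch that could go in your .bashrc:

```shell
# Wrapper so that "singularity" works without typing the full cvmfs path.
# A shell function is used instead of an alias because it also works in scripts.
singularity() {
    /cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity "$@"
}
```

With this in place, the commands below can be shortened to e.g. singularity pull docker://jetscape/base:v1.4.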

The steps to obtain and run Jetscape are:

1) Download/checkout Jetscape from github following Step 2, point 1, of Jetscape Docker instructions.

NB: it is better not to use your home directory ~/jetscape-docker, but make a dedicated directory on the project disk ( /project/alice/users/$USER )

2) 'Pull' the docker container that is mentioned under point 2:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity pull docker://jetscape/base:v1.4
This will probably fail with a 'Disk quota exceeded' message. To fix this, move your singularity cache to the project disk:
mv ~/.singularity/cache /project/alice/users/$USER/singularity-cache
ln -s /project/alice/users/$USER/singularity-cache ~/.singularity/cache
(Another way to achieve this is to set the SINGULARITY_CACHEDIR environment variable.)
Then try again. This downloads some cache files and produces a singularity image file named base_v1.4.sif
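
The SINGULARITY_CACHEDIR alternative mentioned in step 2 could look like the sketch below. The function name is hypothetical; the directory is the same one used for the symlink approach.

```shell
# Point the singularity cache at a directory outside the home quota.
use_singularity_cache() {
    local dir="$1"
    mkdir -p "$dir" || return 1
    export SINGULARITY_CACHEDIR="$dir"
}

# On a Nikhef node (e.g. in your .bashrc):
#   use_singularity_cache "/project/alice/users/$USER/singularity-cache"
```

Compared with the symlink, this leaves ~/.singularity untouched, but the variable has to be set in every shell from which you run singularity.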

3) Enter the Jetscape container:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity run --home /project/alice/users/$USER/jetscape-docker:/home/jetscape-user base_v1.4.sif 
You still have to do 'cd' once to get to the home directory.

4) Compile Jetscape, following step 2.3 of the Jetscape instructions

You now have a container that is ready to run Jetscape. For running instructions, see the Jetscape summer school

Developing code in O2 on the Cluster

In case you need to develop code directly within O2 (and not just run scripts) you might want to have the source code available and be able to compile it. Make sure to check back with your supervisor if this is really needed, or if the use of the cvmfs version above is sufficient. The latter is much more convenient and is automatically kept up to date.

Start by following the General information above to log in to one of the stbc nodes. You will need a lot of disk space (around 50 GB or more), so you cannot install this in your home directory! Check with your supervisor where best to put the installation and the code. In the following I will call the path to this directory $YOUR_DIR; feel free to insert a full path or define it in any other way. The steps generally follow those outlined in the aliBuild documentation, but some useful tricks are given below.

It is crucial to avoid any conflicts with existing software loaded via CVMFS, so you need to make sure your shell is clean. Check your ~/.bashrc and remove anything that loads cvmfs-related things (e.g. source /cvmfs/alice.cern.ch/etc/login.sh). If you still want to be able to load software via cvmfs, you can for example create a small shell script that executes the necessary commands and source it when you need it.
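
Such an on-demand script could look like the following sketch. The function name and the optional path argument are assumptions made for illustration; by default it sources the same login script that was previously in your .bashrc.

```shell
# Load the cvmfs ALICE environment only when asked, instead of in ~/.bashrc.
# The optional argument overrides the login script path (useful for testing).
load_alice_cvmfs() {
    local login="${1:-/cvmfs/alice.cern.ch/etc/login.sh}"
    if [ -f "$login" ]; then
        . "$login"
    else
        echo "cvmfs login script not found: $login" >&2
        return 1
    fi
}

# Usage: put the function in a file such as ~/alice_cvmfs.sh (hypothetical name),
# then run:  source ~/alice_cvmfs.sh && load_alice_cvmfs
```

Use it in a separate terminal from the one in which you build or run your personal O2 installation, so the two environments do not mix.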

In general you need to achieve the following things:

  1. Update pip3 (not pip!) from version 9.X to 21.X
  2. Update aliBuild to the most recent version
  3. Get and build the O2 related software

For steps 1 and 2 (according to my own experience™), it is important to use the python3 version of pip, which is done by using the pip3 and not the pip command! Furthermore, you do not have sudo rights on the stbc, so you have to install pip/alibuild with the --user option. For this to work, you need to provide a directory where those things can be installed by defining

export PYTHONUSERBASE="${YOUR_DIR}/user_python"

export PATH="$PYTHONUSERBASE/bin:$PATH"

for example in your .bashrc - according to my own experience™, this does not conflict with the use of cvmfs and is generally safe to use.

Next, in order to update pip3, either open a new (clean) terminal or run source ~/.bashrc, and check that it worked with echo $PYTHONUSERBASE, which should output the path to the user_python directory. If you want to check, the current pip3 version can be retrieved with pip3 --version, which at the time of writing gives version 9.X.

To commence the upgrade of pip3, execute

pip3 install --upgrade pip --user

You can check the pip3 version again to see if it worked - it should now be version 21.X.
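
If you want to script this check, the major version can be extracted from the pip3 output. This is a sketch with a hypothetical function name; it assumes the output starts with a line of the form "pip <version> from ...", and the version number in the demonstration line is made up.

```shell
# Extract the major version from the output of "pip3 --version",
# which starts with a line such as "pip 21.3.1 from ...".
pip_major_version() {
    awk 'NR == 1 { split($2, v, "."); print v[1] }'
}

# On the cluster:  pip3 --version | pip_major_version
# Self-contained demonstration with a sample line (the version number is made up):
echo "pip 21.3.1 from /some/path (python 3.6)" | pip_major_version
```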

Next up, you can get the most recent version of aliBuild, to do so just execute

pip3 install alibuild --upgrade --user

This concludes steps 1) and 2). The prerequisites listed here should be (at least at the time of writing) available and up to date. In order to install O2, I am going to assume you will install it to ${YOUR_DIR}/alice. It's convenient to execute the following lines

export ALIBUILD_WORK_DIR="${YOUR_DIR}/alice/sw"

eval "`alienv shell-helper`"

whenever you use O2 from your personal installation. You need to judge whether you want to define this in your .bashrc (and load it each time you open a new terminal on stbc) or whether you want to have a dedicated shell script for this. Next, go to the directory (cd ${YOUR_DIR}/alice/) and execute

aliBuild init O2@dev --defaults o2

It might be enough (or even necessary, e.g. if you want to use the Run3Analysisvalidation framework) to use the "lightweight" installation of O2 via aliBuild init O2Physics@master --defaults o2; your supervisor should be able to tell you. After these commands complete, you should see two directories called O2 (or O2Physics for the lightweight installation) and alidist. Then start the build of the O2 software (grab a coffee/other work/Bob Ross episodes/vacations, as this can take 8 hrs or more) by executing

aliBuild build O2 --defaults o2 --always-prefer-system

(for the "lightweight" installation the command should be aliBuild build O2Physics --defaults o2 --always-prefer-system).

After a successful build, it should tell you something like this (this might differ in case you build O2Physics):

  You can use this package by loading the environment:

   alienv enter O2/latest-dev-o2

Things to note:

If you want to check that you are good to go with the build, you can invoke aliDoctor via:

aliDoctor O2 --defaults o2

or

aliDoctor O2Physics --defaults o2

The output should say that you are good to go, but that some things cannot be picked up from the system and have to be built from scratch.

The option --always-prefer-system tells aliBuild to use packages already installed on the system wherever they are compatible. In my experience the remaining packages are then built from scratch instead of being taken from the precompiled binaries, which would be much faster. However, I found this necessary (feel free to fix this part ...) since the precompiled binaries need a symlink which aliBuild doesn't seem to be able to create.

The stbc has a lot of RAM, so there shouldn't be any issues related to that. It is worth noting that O2 is hungry for memory when compiled in parallel and tends to crash on machines with 32 GB of memory. If this happens (maybe the cluster is busy), you can let the build reach the point where it crashes and then restart it using aliBuild build O2 -j 1 --defaults o2 --always-prefer-system (feel free to increase the number of parallel jobs by using the option -j 2 or 3 ...).
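
The restart described above could be wrapped in a small helper, sketched below under a hypothetical name. Note that it retries after any failure, not only after running out of memory, so check the error message before relying on it.

```shell
# Try a parallel build first; if it fails, retry with a single job.
# The package name (O2 or O2Physics) is passed as the first argument.
build_with_fallback() {
    local pkg="$1"
    aliBuild build "$pkg" --defaults o2 --always-prefer-system && return 0
    echo "Parallel build failed, retrying with -j 1" >&2
    aliBuild build "$pkg" -j 1 --defaults o2 --always-prefer-system
}

# Usage:  build_with_fallback O2
```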

Last but not least, if you haven't defined ALIBUILD_WORK_DIR and loaded the alienv shell-helper, you can only enter the O2 environment by navigating to the alice directory (e.g. cd $YOUR_DIR/alice) and executing alienv enter O2/latest-dev-o2 there.
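
A dedicated setup script for the personal installation, as an alternative to putting everything in .bashrc, might look like this sketch. The function name is hypothetical; the two commands inside are the ones given earlier on this page.

```shell
# Set up the shell for a personal O2 installation under <base>/alice.
setup_o2_env() {
    local base="$1"
    export ALIBUILD_WORK_DIR="${base}/alice/sw"
    # alienv is only available once aliBuild has been installed (see above)
    if command -v alienv >/dev/null 2>&1; then
        eval "$(alienv shell-helper)"
    fi
}

# Usage:  setup_o2_env "$YOUR_DIR" && alienv enter O2/latest-dev-o2
```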