AnalysisAtNikhef

From ALICE Wiki
Revision as of 08:54, 21 June 2024 by Panosch@nikhef.nl (talk | contribs) (→‎Running jobs on HTCondor)

General information

This is an index of Technical/computing info for the ALICE group at Nikhef. For support on network, printing, etc, please check the CT wiki page.

If you are a student from Utrecht University, you may also want to consult the instructions for computing at the university.

If you want to run your analysis on the Nikhef batch farm and you are connecting from outside the Nikhef domain, first connect to the login server:

ssh <username>@login.nikhef.nl

Please note that you should never run anything on this machine (not even a browser), since it is the entry point to the Nikhef domain for everybody. For light tasks (e.g. browsing a web page or downloading a paper), use ssh again to connect to one of the desktops of our group.

If you need to run an analysis interactively with access to data stored on the common disk, connect to one of the interactive nodes:

stbc-i1, stbc-i2, stbc-i3, stbc-i4
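The two hops (login server, then interactive node) can be combined into one command with the ProxyJump option of OpenSSH. The sketch below only builds and prints that command. It assumes that the interactive nodes are reachable as <node>.nikhef.nl, which is not stated above, and the user name jdoe is a placeholder.

```shell
# Build the ssh command that hops through the login server to an interactive node.
# Assumption: the interactive nodes are reachable as <node>.nikhef.nl.
nikhef_ssh_command() {
    local user="$1"
    local node="${2:-stbc-i1}"
    # -J (ProxyJump) needs OpenSSH 7.3 or newer on your own machine.
    echo "ssh -J ${user}@login.nikhef.nl ${user}@${node}.nikhef.nl"
}

# Print the command for a placeholder user "jdoe"; run the printed line to connect.
nikhef_ssh_command jdoe stbc-i2
```

Nothing runs on the login server itself in this setup; it only forwards the connection.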

Storage information

  • Each user is allocated a limited space under the home directory. Please use this space only for lightweight files, e.g. macros, PDF files, etc.
  • As a group we are given 300 GB under /project/alice, which is used to install our common software. Please do not use this space to store large files, e.g. ROOT files. This directory can be accessed from all machines, i.e. the desktops as well as the interactive and batch nodes of Nikhef.
  • The current common storage of Nikhef is based on dcache (for more information about the system, check this link), an efficient system to store and retrieve large files. It can be accessed only from the interactive and batch nodes of Nikhef. As a group we have ~300 TB of disk space under /dcache/alice. This is currently the place where you can store your files for analysis, production, etc. If you want access to it, please send a mail to Panos Christakoglou. Note that this storage is not visible to AliEn; it is reserved for local usage. It can, however, see the GRID file catalogue, which allows copying productions to it (the typical use case of this storage for the group).
  • Some scratch disk space is available under /data/alice

External connection information

Please see the separate guide about using VSCode with a connection to Nikhef

ALICE software and how to access it

The main ALICE software packages are called O2 and O2Physics for Run 3 data analysis and AliRoot and AliPhysics for Run 1 and 2 data. They all use ROOT for the production of histograms, and for storing information in files, using root trees.

For more information on the usage of ALICE software, please consult the ALICE Analysis tutorial and the O2 analysis documentation

The instructions below contain specific information for using O2Physics and ROOT on Nikhef PCs and the Stoomboot cluster.

Precompiled software

To get started, you can probably use a precompiled version of the ALICE software (O2/O2Physics for Run 3 analysis, AliRoot/AliPhysics for Run 1 and 2 data). It can be accessed via cvmfs, a wide area network file share hosted at CERN, which is available on all machines at Nikhef. To get it working, first add the following line to your .bashrc (or the startup script of whichever shell you use):

source /cvmfs/alice.cern.ch/etc/login.sh

(NB: this assumes that you are using the bash shell; you can check with 'echo $SHELL' and if needed, change using this link: https://sso.nikhef.nl/chsh/ )

A full list of packages is available on the MonaLisa website (the list is searchable; the link gives O2Physics packages, but you can change that). You can also get a list on the command line by doing:

alienv q | grep O2

but this is very slow since the information is read from a remote disk

If the alienv command is not recognised, you can try instead

/cvmfs/alice.cern.ch/bin/alienv q | grep O2

here and in the following lines.
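The choice between the two forms of the command can be wrapped in a small helper. This is a minimal sketch; the function name find_alienv is made up for illustration, and the only path it knows about is the cvmfs one quoted above.

```shell
# Print the alienv executable to use: the one on PATH if present,
# otherwise the full cvmfs path quoted above. Returns 1 if neither exists.
find_alienv() {
    local fallback="${1:-/cvmfs/alice.cern.ch/bin/alienv}"
    if command -v alienv >/dev/null 2>&1; then
        command -v alienv
    elif [ -x "$fallback" ]; then
        echo "$fallback"
    else
        echo "alienv not found (is cvmfs mounted?)" >&2
        return 1
    fi
}

# Example: list the O2Physics packages with whichever alienv was found.
# ALIENV=$(find_alienv) && "$ALIENV" q | grep O2Physics
```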

To load the ALICE environment on your Nikhef node, use:

alienv enter VO_ALICE@O2Physics::daily-20240220-0100-1

This loads O2Physics from 20 February 2024; you can pick a later date by changing the tag. All dependencies, i.e. other packages that are needed for O2Physics, are also loaded automatically. To find out which versions are used, use

alienv list
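To pick a different date, only the date part of the tag changes. The helper below is a sketch that assumes the daily tags keep the pattern of the example above (daily-<date>-0100-1); this is an assumption, so check the output of alienv q if a tag is not found. The function name is made up for illustration.

```shell
# Build the package string for a daily O2Physics tag from a date (YYYYMMDD).
# Assumption: the tag follows the pattern of the example above,
# daily-<date>-0100-1; check "alienv q" if the tag is not found.
o2physics_daily_tag() {
    local day="$1"
    case "$day" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "expected a date as YYYYMMDD, got '$day'" >&2; return 1 ;;
    esac
    echo "VO_ALICE@O2Physics::daily-${day}-0100-1"
}

# Example: alienv enter "$(o2physics_daily_tag 20240220)"
```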

If you plan to write your own code to add to O2Physics, you have to compile O2Physics locally, following the instructions below.

Compiling your own version of O2Physics

If you would like to add or modify analysis code in O2Physics, you can either do this on your own computer (Linux or Mac) or you can use a Nikhef PC or the stoomboot interactive nodes. The first step is to set up a development environment and compile and install the software on the PC that you are using. Note that the code and compiled output are large (Gigabytes) so you have to use a large disk, for example /data/alice/yourusername. Before starting on this, you may want to check with your supervisor if this is really needed, or if the use of the cvmfs version above is sufficient. The latter is much more convenient and is automatically kept up to date.

In the following I will call the path to the directory on /data $YOUR_DIR, feel free to insert a full path or define it in any other way. The steps in general follow the ones outlined in the aliBuild documentation, but some useful tricks are outlined below.

It is crucial to avoid any conflicts with existing software loaded via CVMFS, so you need to make sure your shell is clean. Check your ~/.bashrc and remove anything that loads cvmfs-related things (e.g. source /cvmfs/alice.cern.ch/etc/login.sh). If you still want to be able to load software via cvmfs, you can for example create a small shell script that executes the necessary commands and source it when you need it.
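A possible way to do both things is sketched here: one function that reports active cvmfs lines in a startup file, and one that loads the cvmfs setup on demand. Both function names are made up for illustration. The check is a plain text search, so it will not catch cvmfs settings pulled in indirectly from other files.

```shell
# Return 0 if the given startup file has no active (uncommented) cvmfs lines.
# Offending lines are printed with their line numbers.
shell_is_clean() {
    local rcfile="${1:-$HOME/.bashrc}"
    [ -f "$rcfile" ] || return 0
    if grep -n 'cvmfs' "$rcfile" | grep -v '^[0-9]*:[[:space:]]*#'; then
        return 1
    fi
    return 0
}

# Load the cvmfs ALICE setup only when asked for (hypothetical helper name).
load_cvmfs_alice() {
    local login="${1:-/cvmfs/alice.cern.ch/etc/login.sh}"
    [ -f "$login" ] || { echo "cannot find $login" >&2; return 1; }
    . "$login"
}
```

With these in a file that you source by hand, shell_is_clean tells you whether ~/.bashrc still needs editing, and load_cvmfs_alice replaces the line you removed from it.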

In general you need to achieve the following things:

  1. Update pip3 (not pip!) from version 9.X to 21.X
  2. Update aliBuild to the most recent version
  3. Get and build the O2 related software

Setting up the software environment -- one time steps

For step one and two (according to my own experience™), it is important to use the python3 version of pip, which is done by using the pip3 and not the pip command! Furthermore, you do not have sudo rights on the stbc, and you have to install pip/alibuild with the --user option. For this to work, you need to provide a directory where those things can be installed to by defining

export PYTHONUSERBASE="${YOUR_DIR}/user_python"
export PATH="$PYTHONUSERBASE/bin:$PATH"

for example in your .bashrc - according to my own experience™, this does not conflict with the use of cvmfs and is generally safe to use.

Next, in order to update pip3, either open a new (clean) terminal or do source ~/.bashrc, and check that it worked with echo $PYTHONUSERBASE, which should output the path to the user_python directory. If you want to check, the current pip3 version can be retrieved with pip3 --version, which at the time of writing gives version 9.X.

To commence the upgrade of pip3, execute

pip3 install --upgrade pip --user

You can check the pip3 version again to see if it worked; it should now be version 21.X.
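If you prefer to script this check, the major version can be read from the first line that pip3 --version prints, which starts with "pip <version> from ...". The function names below are made up for illustration, and the threshold of 21 is simply the version quoted above.

```shell
# Extract the major version from the output of "pip3 --version",
# which starts with "pip <major>.<minor>...".
pip_major_version() {
    echo "$1" | awk '{ split($2, v, "."); print v[1] }'
}

# Return 0 if the given "pip3 --version" output reports at least version 21.
pip_is_recent() {
    local major
    major=$(pip_major_version "$1")
    [ "${major:-0}" -ge 21 ]
}

# Example: pip_is_recent "$(pip3 --version)" && echo "pip3 is up to date"
```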

Next up, you can get the most recent version of aliBuild, to do so just execute

pip3 install alibuild --upgrade --user

This concludes steps 1) and 2). The prerequisites listed here should be (at least at the time of writing) available and up to date. In order to install O2, I am going to assume you will install it to ${YOUR_DIR}/alice. It is convenient to execute the following lines

export ALIBUILD_WORK_DIR="${YOUR_DIR}/alice/sw"
eval "`alienv shell-helper`"

whenever you use O2 from your personal installation. You need to judge if you want to define this in your .bashrc (and load it each time you open a new terminal on stbc) or if you want to have a dedicated shell script for this. Next up go to the directory (cd ${YOUR_DIR}/alice/ ) and execute

aliBuild init O2@dev --defaults o2

It might be enough (or even necessary, e.g. if you want to use the Run3Analysisvalidation framework) to use the "lightweight" installation of O2 by running aliBuild init O2Physics@master --defaults o2 instead; your supervisor should be able to tell you.
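If you go for a dedicated shell script rather than your .bashrc, the environment settings from the one-time steps can be collected in one function. This is a sketch: the function name setup_local_o2 is made up, and the directory you pass in plays the role of $YOUR_DIR.

```shell
# Set up the environment for a personal O2 installation under a given directory.
# Usage (e.g. from a script you source by hand): setup_local_o2 /data/alice/$USER
setup_local_o2() {
    local your_dir="$1"
    [ -n "$your_dir" ] || { echo "usage: setup_local_o2 <dir>" >&2; return 1; }

    # User-level Python packages (pip3, aliBuild) live here.
    export PYTHONUSERBASE="${your_dir}/user_python"
    export PATH="${PYTHONUSERBASE}/bin:${PATH}"

    # Where aliBuild keeps sources and build products.
    export ALIBUILD_WORK_DIR="${your_dir}/alice/sw"

    # Load the alienv helper only if aliBuild's alienv is already installed.
    if command -v alienv >/dev/null 2>&1; then
        eval "$(alienv shell-helper)"
    fi
}
```

Keeping this out of .bashrc means a fresh terminal stays clean, and you decide per session whether to use the personal installation.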

Compilation and loading the environment scripts

After completion of these commands you should see two directories, called O2 and alidist. Grab a coffee/other work/Bob Ross episodes/vacations (this can take 8 hrs or more), and start the build of the O2 software by executing

aliBuild build O2 --defaults o2

(for the "lightweight" installation the command should be aliBuild build O2Physics --defaults o2). You can use the package by loading the environment like this:

   alienv enter O2/latest-dev-o2

Things to note:

In case you want to check you are good to go with the build, you can invoke the aliDoctor via:

aliDoctor O2 --defaults o2

or

aliDoctor O2Physics --defaults o2

The output should say that you are good to go, but that some things cannot be picked up from the system and have to be built from scratch.

The stbc has a lot of RAM, so there shouldn't be any issues related to that. It is worth noting, however, that O2 is hungry for memory when compiled in parallel and can crash on machines with 32 GB of memory. If this happens (maybe the cluster is busy), you can let the build run until it crashes and then restart it using aliBuild build O2 -j 1 --defaults o2 (feel free to increase the number of threads by using the option -j 2 or 3 ...).
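The crash-and-restart procedure can be automated with a small wrapper, sketched here. The function name is made up for illustration; it relies on the restart picking up where the failed build stopped, as described above.

```shell
# Try a normal (parallel) build first; if it fails, restart with a single job.
build_with_fallback() {
    local package="${1:-O2}"
    if aliBuild build "$package" --defaults o2; then
        return 0
    fi
    echo "Parallel build failed, retrying with -j 1" >&2
    aliBuild build "$package" -j 1 --defaults o2
}

# Example: build_with_fallback O2Physics
```

Note that the retry also happens when the first build fails for a reason other than memory, in which case it will simply fail again, more slowly.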

Last but not least, if you have not defined ALIBUILD_WORK_DIR and have not loaded the alienv shell-helper, you can only enter the O2 environment by navigating to the alice directory (e.g. cd $YOUR_DIR/alice) and executing alienv enter O2/latest-dev-o2 there.

Troubleshooting

If pip3 --version gives you a ModuleNotFoundError, this may be due to multiple python installations being used at the same time. You can check this by comparing the output of which python3, which pip and which pip3. A possible fix is to remove the user_python directory and to reinstall pip3 and alibuild. You can do this by executing:

rm -r ${YOUR_DIR}/user_python

mkdir ${YOUR_DIR}/user_python

python3 -m pip install --upgrade pip --user

python3 -m pip install alibuild --upgrade --user

Verify that you have the correct version of pip (pip3 --version) and continue with the installation.


Running jobs on HTCondor

Nikhef is moving to a new batch system software which is based on HTCondor.

Practical information on how to use the system can be found under the CT wiki.

Some basic scripts that can be used to run on the cluster can be downloaded from this link. Note that you need to change both the submit and the run scripts accordingly!!!

Running analyses on the Stoomboot cluster

We have two local computer clusters:

A new storage is deployed at Nikhef which is based on dcache. Overall, close to 300TB of disk space are available for the group. This storage is intended to be used for local, batch (i.e. using stoomboot) analysis of data samples that are moved to dcache.

Running instructions for AliPhysics, Run 1/2 data

Currently the LHC10h (AOD160) and the low intensity runs of LHC15o Pb-Pb periods are stored under /dcache/alice/panosch/alice/data/. Below you can find a template of a script, called submit.sh (it can certainly be written better), that allows you to launch a series of batch jobs and analyze either the 2010 or the 2015 data. To run it, do the following from one of the Nikhef desktops:

source submit.sh lhc10h.txt 2010 

where the text file contains all the run numbers of 2010 that were copied to Nikhef and 2010 indicates the production year.

#!/bin/bash
SCRIPT="runAnalysis.sh"
while IFS= read -r runNumber || [ -n "$runNumber" ]; do
    echo "Adding run number from file: $runNumber"

#make the script to submit
    (echo "#!/bin/bash"
echo "source /cvmfs/alice.cern.ch/etc/login.sh"
echo "eval \$(alienv printenv VO_ALICE@AliPhysics::vAN-20161005-1)"
echo "which aliroot || exit 1"
if [ "$2" == "2010" ]
then
    echo "cd /dcache/alice/panosch/alice/data/2010/LHC10h/AOD160/$runNumber"
elif [ "$2" == "2015" ]
then
    echo "cd /dcache/alice/panosch/alice/data/2015/LHC15o/000$runNumber"
    echo "cd pass2_lowIR"
else 
    exit
fi
echo "pwd"
echo "if [ -f AnalysisResults.root ]"
echo "  then "
echo "rm -rf AnalysisResults.root"
echo "fi"
echo "if [ ! -f runFlowPIDSPTask.C ]"
echo " then "
echo "ln -s /user/panosch/ALICE/Flow/HigherHarmonics/Stoomboot/runFlowPIDSPTask.C ." 
echo "fi"
echo "exec aliroot -b -q runFlowPIDSPTask.C"
    ) > $SCRIPT 

qsub -q stbcq $SCRIPT 

done < "$1"
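When a job script is written with echo lines in double quotes, any unescaped $(...) or $variable is expanded while the submit script runs, not inside the job. A heredoc with a quoted delimiter avoids the escaping question altogether. The sketch below writes the same job script for one run; the function name make_job_script and the DATADIR variable are made up for illustration, and the paths are the ones from the template above.

```shell
# Write a job script for one run; $1 = run number, $2 = year, $3 = output file.
# The heredoc delimiter is quoted ('EOF'), so $(...) and $variables in the body
# are written literally and only evaluated when the job runs on the batch node.
make_job_script() {
    local run="$1" year="$2" out="$3" datadir
    case "$year" in
        2010) datadir="/dcache/alice/panosch/alice/data/2010/LHC10h/AOD160/${run}" ;;
        2015) datadir="/dcache/alice/panosch/alice/data/2015/LHC15o/000${run}/pass2_lowIR" ;;
        *) echo "unknown year: $year" >&2; return 1 ;;
    esac
    {
        echo '#!/bin/bash'
        echo "DATADIR=\"${datadir}\""
        cat <<'EOF'
source /cvmfs/alice.cern.ch/etc/login.sh
eval $(alienv printenv VO_ALICE@AliPhysics::vAN-20161005-1)
which aliroot || exit 1
cd "$DATADIR" || exit 1
rm -f AnalysisResults.root
[ -f runFlowPIDSPTask.C ] || ln -s /user/panosch/ALICE/Flow/HigherHarmonics/Stoomboot/runFlowPIDSPTask.C .
exec aliroot -b -q runFlowPIDSPTask.C
EOF
    } > "$out"
}

# Example (the run number is a placeholder): make_job_script 139510 2010 runAnalysis.sh
```

The generated file is submitted in the same way as in the template. Unlike the template, an unknown year makes the function return an error before anything is written, so the loop can skip the submission for that run.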

Using singularity to run Jetscape at Nikhef

On the Nikhef stoomboot cluster and login nodes, singularity is available to run code in 'containers' (a kind of system inside a system mode), which is for example useful for running the Jetscape generator.

The singularity executable is available in this directory:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity 

You can use the full path name, or put an alias in your bashrc file.

The steps to obtain and run Jetscape are:

1) Download/checkout Jetscape from github following Step 2, point 1, of Jetscape Docker instructions.

NB: it is better not to use your home directory ~/jetscape-docker, but make a dedicated directory on the project disk ( /project/alice/users/$USER )

2) 'Pull' the docker container that is mentioned under point 2:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity pull docker://jetscape/base:v1.4
This probably fails with a 'Disk quota exceeded' message. To fix this, move your singularity cache to the project disk:
mv ~/.singularity/cache /project/alice/users/$USER/singularity-cache
ln -s /project/alice/users/$USER/singularity-cache ~/.singularity/cache
(Another way to achieve this is by setting the SINGULARITY_CACHEDIR environment variable.)
Then try again. This downloads some cache files and produces a singularity image file named base_v1.4.sif

3) Enter the Jetscape container:

/cvmfs/oasis.opensciencegrid.org/mis/singularity/bin/singularity run --home /project/alice/users/$USER/jetscape-docker:/home/jetscape-user base_v1.4.sif 
You still have to do 'cd' once to get to the home directory.

4) Compile Jetscape, following step 2.3 of the Jetscape instructions.

You now have a container that is ready to run Jetscape. For running instructions, see the Jetscape summer school
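For step 2, the SINGULARITY_CACHEDIR alternative to moving and linking the cache directory could look like the sketch below. The function name is made up for illustration; the default directory is the project directory used in the steps above.

```shell
# Point the Singularity cache at a directory on the project disk.
# $1 = base directory, default /project/alice/users/$USER as used above.
use_project_singularity_cache() {
    local base="${1:-/project/alice/users/$USER}"
    mkdir -p "${base}/singularity-cache" || return 1
    export SINGULARITY_CACHEDIR="${base}/singularity-cache"
}

# Example: call this once per session before the 'singularity pull' of step 2.
# use_project_singularity_cache
```

Compared with the symbolic link, this leaves ~/.singularity untouched, but the variable has to be set in every session in which you pull or build containers.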