Stoomboot

What is stoomboot

Stoomboot is a batch farm for local use at NIKHEF. It is in principle open to all NIKHEF users, but a login account does not give automatic access to stoomboot. Contact helpdesk@nikhef.nl to gain access.

Hardware

Stoomboot consists of 32 nodes (stbc-01 through stbc-32), each equipped with dual quad-core Intel Xeon E5335 2.0 GHz processors and 16 GB of memory. The total number of cores is 256.

Software & disk access

All stoomboot nodes run Scientific Linux 5. All NFS-mountable disks at NIKHEF are visible (/project/* and /data/*), as well as all GlusterFS disks (/glusterfs/atlas*). Stoomboot does not run AFS, so AFS directories, including /afs/cern.ch, are not visible. This may indirectly affect you, as certain experimental software installations attempt to access files on /afs/cern.ch (CVMFS is available for software installations). As stoomboot is intended as a local batch farm, there are no plans to install AFS.

How to use stoomboot

Submitting batch jobs

Stoomboot is a batch-only facility; jobs are submitted through the PBS qsub command:

unix> qsub test.sh
9714.allier.nikhef.nl

The argument passed to qsub is a script that will be executed in your home directory. The returned string is the job identifier and can be used to look up the status of the job, or to manipulate it later. Jobs can be submitted from any Linux desktop at NIKHEF as well as from login.nikhef.nl. If you cannot submit jobs from your local desktop, contact helpdesk@nikhef.nl to have the batch client software installed.
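As a minimal sketch, the submitted script is an ordinary shell script; the file name and its contents below are illustrative only:

  #!/bin/sh
  # test.sh -- example job script (contents are hypothetical)
  echo "Job running on $(hostname) in $(pwd)"
  # the job starts with a clean shell (see the defaults below), so set up any
  # experiment software here, e.g. by sourcing a (hypothetical) setup script
  ./myprogram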

The output of the job appears in files named <jobname>.o<number>, e.g. test.sh.o9714 in the example above. The following default settings apply when you submit a batch job:


  • Job runs in home directory ($HOME)
  • Job starts with a clean shell (environment variables from the shell from which you submit are not transferred to the batch job). E.g. if you need the ATLAS software setup, it should be done in the submitted script
  • Job output (stdout) is sent to a file in the directory in which the job was submitted. Job stderr output is sent to a separate file. E.g. in the example above, file test.sh.o9714 contains stdout and file test.sh.e9714 contains stderr. If there is no stdout or stderr, an empty file is created
  • A mail is sent to you if the output files cannot be created


Here is a list of frequently desired changes to the default behavior and their corresponding qsub options; a combined example follows the list:


  • Merge stdout and stderr in a single file. Add option -j oe to qsub command (single file *.o* is written)
  • Choose batch queue. Right now there are five queues: stbcq (the default queue, 8 hours), express (10 min), short (4 hours), qlong (48h) and budget (low priority). Add option -q <queuename> to qsub command
  • Choose different output file for stdout. Add option -o <filename> to qsub command
  • Pass all environment variables of submitting shell to batch job (with exception of $PATH). Add option -V to qsub command
  • Run the job on a specific stoomboot node. Add option -l host=stbc-XX to the qsub command line.
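For instance, a sketch combining several of the options above (the script and queue names are taken from the earlier examples; the output file name is hypothetical):

  unix> qsub -q qlong -j oe -o test.out -V test.sh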


A full list of options can be obtained from man qsub.

Examining the status of your jobs

The qstat -u <username> command shows the status of all your jobs. Status code 'C' indicates completed, 'R' indicates running and 'Q' indicates queued.

unix> qstat
Job id              Name             User            Time Use S Queue
------------------- ---------------- --------------- -------- - -----
9714.allier         test.sh          verkerke        00:00:00 C test

The qstat command only shows your own jobs, not those of other users. Only jobs that completed less than 10 minutes ago are listed with status 'C'. Output of jobs that completed longer ago is kept, but they are simply no longer listed in the status overview.

To see the activity of other users on the system you can use the lower-level maui command showq, which shows the jobs of all users. The showq command works without arguments on login.nikhef.nl; on any other host, add --host=allier to run it successfully.
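For example (the host name below simply repeats the server mentioned above):

  unix> showq                   # on login.nikhef.nl
  unix> showq --host=allier     # on any other host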

The general level of activity on stoomboot is graphically monitored at this location: http://www.nikhef.nl/grid/stats/stbc/

Common practical issues in stoomboot use

LSF job submission emulator

The interface of the PBS batch system is notably different from that of the LSF batch system run at e.g. CERN and SLAC. One of the convenient features of LSF bsub is that the user does not need to write a script for every batch job: the command line passed to bsub is executed directly. An emulator is available for the LSF bsub command that submits a job which executes the bsub command line in the present working directory and with the complete present environment. For example one can do

  bsub ls -l 

which will submit a batch job that executes ls -l in the working directory from which the bsub command was executed. This script expressly allows the user to set up e.g. the complete ATLAS software environment in a shell on the local desktop and then substitute local desktop running of an ATLAS software job with a batch-run job by prefixing bsub to the executed command line. The scope of the LSF bsub emulator is limited to its ability to execute the command line in batch in an identical environment; it does not emulate the various command line flags of LSF bsub. You can find the bsub emulator for now in ~verkerke/bin/bsub.
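A minimal usage sketch, assuming the emulator location given above; the setup step and the analysis command are hypothetical placeholders:

  # make the emulator available in this shell
  export PATH=~verkerke/bin:$PATH
  # set up your experiment software in this shell as you would for a local run
  # (e.g. source a hypothetical setup script here)
  # then run the same command line in batch instead of locally
  bsub ./myAnalysis input.root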


Suggestions for debugging and troubleshooting

If you want to debug a problem that occurs on a stoomboot batch job, or you want to make a short trial run for a larger series of batch jobs there are two ways to gain interactive login access to stoomboot.

  • You can log in directly to nodes stbc-i1 through stbc-i4 (these nodes only) to test and/or debug your problem. You should try to keep CPU consumption and testing time to a minimum, and run your real jobs through qsub on the actual nodes.
  • You can request an 'interactive' batch job through qsub -q qlong -X -I. In this mode you can consume as much CPU as the queue to which the interactive job was submitted allows. The 'look and feel' of interactive batch jobs is nearly identical to that of ssh. The main exception is that when no free job slot is available, the qsub command will hang until one becomes available.
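For example, to start an interactive session on the qlong queue as described above:

  unix> qsub -q qlong -X -I
  # waits until a job slot is free, then opens a shell on a stoomboot worker node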

Scratch disk usage and NFS disk access

When running on stoomboot, please be sure to put all local 'scratch' files in the directory pointed to by the environment variable $TMPDIR and not in /tmp. The latter is very small (a few GB) and, when it fills up, will cause all kinds of problems for you and other users. The disk pointed to by $TMPDIR is typically 200 GB. Also be sure to clean up when your job ends, to avoid filling up this disk as well.

When accessing NFS mounted disks (/project/*, /data/*) please keep in mind that the network bandwidth between stoomboot nodes and the NFS server is limited and that the NFS server capacity is also limited. Running e.g. 50 jobs that read from or write to files on NFS disks at a high rate ('ntuple analysis') may result in poor performance of both the NFS server and your jobs.
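A minimal job-script sketch that follows both recommendations, staging data into $TMPDIR once and cleaning up afterwards (all paths and program names are hypothetical):

  #!/bin/sh
  cd $TMPDIR                           # do all heavy I/O on the local scratch disk
  cp /data/myexp/ntuple.root .         # copy the NFS input once, rather than reading it repeatedly
  ./analyse ntuple.root result.root    # hypothetical analysis program
  cp result.root /project/myexp/       # copy the (small) output back to NFS
  rm -f ntuple.root result.root        # clean up the scratch disk before the job ends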

Scheduling policies and CPU quota

This section is sensitive to change, as scheduling policies and quota allocation are still evolving. At the time of writing (December 2008) each group (atlas, bphys, etc.) is allowed to use at most 96 run slots (i.e. 75% of the available capacity; this is the hard limit). When the system is 'busy', as determined by the maui scheduler, a lower soft limit of 64 run slots is enforced (50% of the capacity). Each individual user is entitled to use all run slots of his group. To see which policy prevents your queued jobs from running, use the checkjob <jobid> command.
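For example, using the job identifier from the qsub example earlier on this page:

  unix> checkjob 9714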


Questions, communication and announcements on stoomboot

To ask questions and to receive announcements on stoomboot operations, subscribe to the stoomboot users mailing list (stbc-users@nikhef.nl). To subscribe yourself to this list go to https://mailman.nikhef.nl/mailman/listinfo/stbc-users.