Difference between revisions of "LCMAPS Tracking GroupID plugin"

From PDP/Grid Wiki

Revision as of 14:01, 9 April 2011

Tracking Group IDs are added to batch jobs so that the processes of a job can still be tracked even if they escape the process tree.

Batch systems that use this feature are:

  • Sun Grid Engine (SGE, now known as the Oracle Grid Engine)
  • Condor-C batch system

Other batch systems are known to have the feature, but it doesn't seem to be used in (known) Grid deployments:

  • LSF
  • Torque/PBS
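As an illustration of why a group ID is useful for tracking, the following minimal sketch finds processes by supplementary group instead of by parentage. It is not taken from LCMAPS or from any batch system. It assumes a Linux node where `/proc/<pid>/status` contains a `Groups:` line, and the function names and the GID value used below are hypothetical.

```python
import os
from typing import Iterator, List


def parse_groups(status_text: str) -> List[int]:
    """Return the supplementary group IDs from the 'Groups:' line of a
    Linux /proc/<pid>/status file (empty list if the line is absent)."""
    for line in status_text.splitlines():
        if line.startswith("Groups:"):
            return [int(field) for field in line.split()[1:]]
    return []


def pids_with_group(tracking_gid: int, proc_root: str = "/proc") -> Iterator[int]:
    """Yield the PID of every process that carries tracking_gid as a
    supplementary group, regardless of which process is its parent."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                groups = parse_groups(handle.read())
        except OSError:
            # The process exited in the meantime, or it is not readable.
            continue
        if tracking_gid in groups:
            yield int(entry)
```

With a hypothetical tracking GID of 65001, `list(pids_with_group(65001))` would list every process carrying that group, including processes that were re-parented to init and therefore no longer appear under the batch daemon in the process tree.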

/*

Why do we need this plugin?

Here is an example process tree on a Worker Node of a PBS/Torque-based cluster. For illustration purposes, all unrelated processes have been removed from the tree:

init-+
     ├─pbs_mom
     │   ├─bash
     │   │   └─1337.stro.n /var/spool/pbs/mom_priv/jobs/1337.stro.nikhef.nl.SC
     │   │       └─jobwrapper /opt/lcg/libexec/jobwrapper ./CREAM31337_jobWrapper.sh
     │   │           └─CREAM31337_ -l ./CREAM31337_jobWrapper.sh
     │   │               └─glexec /bin/bash payload.sh
     │   │                   ├─payload.sh
*/