Difference between revisions of "GLExec"
Revision as of 10:30, 31 May 2009
gLExec is a program to make the required mapping between the grid world and the Unix notion of users and groups, and has the capacity to enforce that mapping by modifying the uid and gids of running processes. Based on LCAS and LCMAPS, it can act as a light-weight 'gatekeeper' replacement, and can also be used on the worker node in late-binding (pilot job) scenarios. Through the LCMAPS SCAS client, a central mapping and authorization service (SCAS, or any interoperable SAML2-XACML2 service) can be used.
gLExec uses LCAS for access control and LCMAPS as the mapping engine. For a service running under a 'generic' uid, such as a web services container, it provides a way to escape from this container uid. It may be used similarly by externally managed services run on a site's edge. Lastly, in a late-binding scenario, the identity of the workload owner can be set at the instant the job starts executing.
The design and its caveats are described in the paper presented at the CHEP conference.
Local services, in particular computing services offered on Unix [5] and Unix-like platforms, use a different native representation of the user and group concepts. In the Unix domain, these are expressed as (numeric) identifiers, where each user is assigned a user identifier (uid) and one or more group identifiers (gid). At any one time, a single gid is the 'primary' gid (pgid) of a particular process. This pgid is initially used for group-level process (and batch system) accounting. The uid and gid representation is local to each administrative domain.
Batch system interoperability
When used on a worker node (in a late-binding pilot job scenario), gLExec tries hard to be neutral to its OS environment. In particular, gLExec will not break the process tree, and will accumulate CPU and system usage times from the child processes it spawns. This is particularly important in the gLExec-on-WN scenario, where the entire process (pilot job and target user processes) should be managed as a whole by the node-local batch system daemon.
You are encouraged to verify OS and batch system interoperability. In order to do that, you have two options:
- Comprehensive testing: Ulrich Schwickerath has defined a series of (partially CERN-specific) tests to verify that gLExec does not break the batch system setup of a site. He has extensively documented his efforts on the Wiki at https://twiki.cern.ch/twiki/bin/view/FIOgroup/FsLSFGridglExec. Note that the Local Tools section is CERN-specific. If you use other tools to clean up the user's work area (such as the $tmpdir facility of PBSPro and Torque), or use the PruneUserproc utility to remove stray processes, you are not affected by this.
- Basic OS and batch-system testing can be done even without installing gLExec, by compiling a simple C program with one hard-coded uid for testing. This is the fastest way to test, but it only verifies that your batch system reacts correctly, not that your other grid-aware system scripts will work as you expect.
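The basic test in the second option can be sketched as follows. The text suggests a small C program; the same idea is shown here in Python for brevity, and it is not part of gLExec. The function name run_as is hypothetical, and the uid and gid you pass in are whatever test account your site chooses. The child stays in the caller's process tree, so the batch system should still see and account for it.

```python
import os


def run_as(uid, gid, argv):
    """Fork, switch to the given uid/gid in the child, exec argv.

    Returns the child's exit code (128 + signal number if it was killed).
    Switching to a different uid requires root privileges.
    """
    pid = os.fork()
    if pid == 0:
        # Child: still part of the parent's process tree.
        try:
            if os.geteuid() == 0:
                os.setgroups([gid])  # drop supplementary groups first
            os.setgid(gid)           # group before user, otherwise setgid fails
            os.setuid(uid)
            os.execvp(argv[0], argv)
        except OSError as exc:
            os.write(2, f"identity switch or exec failed: {exc}\n".encode())
        finally:
            os._exit(126)            # only reached when exec did not happen
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)
```

Run it as root from inside a batch job, for example run_as(65534, 65534, ["sleep", "600"]) with a uid and gid that exist on your node, then cancel the job and check that the sleep process is gone as well.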
The following batch systems are known to be compatible with gLExec-on-the-Worker-Node:
- Torque, all versions
- OpenPBS, all versions
- Platform LSF, all versions
- BQS, all versions
- Condor, all versions
If you notice any anomalies after testing, e.g. a job that will not die, please notify the developers at grid dash mw dash security at nikhef dot nl.
Deploying gLExec on the worker node
The preferred way to deploy gLExec on the worker node is by using (VO-agnostic) generic pool accounts that are local to each worker node. This way, you can be sure that a gLExec'ed job does not "escape" from the node, and it limits the number of pool accounts needed. For this configuration, you:
- create at least as many pool accounts as you have job slots on a WN
- assign a worker node local gridmapdir (suggestion: /var/local/gridmapdir)
- create local pool accounts with a local home directory (suggestion: account names wnpool00 etc, and home directories in a local file system that has enough space, e.g., /var/local/home/poolwn00, etc.)
- configure the lcmaps.db configuration used by glexec to refer to this gridmapdir
Note that the /var/run/glexec directory is used to maintain the mapping between the target and the originator account for easy back-mapping for running jobs. This information is of course also logged to syslog(3).
If you prefer shared pool accounts, you can use a shared atomic state database (implemented as an NFS directory) to host the gridmapdir. All operations on the gridmapdir are atomic, even over NFS, and it scales well (remember that NFS is still the file sharing mechanism of choice for many large installations).
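To illustrate why a plain directory can serve as an atomic state database, here is a minimal sketch. It assumes the commonly used hard-link lease scheme, in which an empty file per pool account is hard-linked to a file named after the encoded subject DN. The function name lease_pool_account, the wnpool prefix and the exact DN encoding are assumptions made for illustration; this is not the LCMAPS implementation. It relies on link() either succeeding or failing as a whole, which is the property that also holds over NFS.

```python
import os
import urllib.parse


def lease_pool_account(gridmapdir, subject_dn, prefix="wnpool"):
    """Return the pool account leased to subject_dn, or None if none is free."""
    # Assumed encoding: URL-quote the DN so it is a valid file name.
    dn_file = os.path.join(gridmapdir, urllib.parse.quote(subject_dn, safe=""))
    try:
        dn_inode = os.stat(dn_file).st_ino
    except FileNotFoundError:
        dn_inode = None

    for name in sorted(os.listdir(gridmapdir)):
        if not name.startswith(prefix):
            continue
        path = os.path.join(gridmapdir, name)
        info = os.stat(path)
        if dn_inode is not None:
            # Existing lease: find the account sharing the DN file's inode.
            if info.st_ino == dn_inode:
                return name
            continue
        if info.st_nlink != 1:
            continue  # already leased to another DN
        try:
            os.link(path, dn_file)  # atomic: succeeds or fails as a whole
        except FileExistsError:
            # A concurrent process mapped this DN first; look its lease up.
            return lease_pool_account(gridmapdir, subject_dn, prefix)
        if os.stat(path).st_nlink == 2:
            return name
        os.unlink(dn_file)  # lost a race with another DN; try the next one
    return None
```

Because a lease is just a second name for the same file, back-mapping an account to its owner only needs a comparison of inode numbers.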
Detailed documentation is given at http://www.nikhef.nl/grid/lcaslcmaps/glexec/glexec-install-procedure.html.
Using the SCAS
If you prefer to use LCMAPS with the SCAS service, add the scas-client plugin to the set of RPMs, and configure the SCAS client. You would add to /opt/glite/etc/lcmaps/lcmaps-glexec.db:
scasclient = "lcmaps_scas_client.mod"
            " -capath /etc/grid-security/certificates/"
            " -endpoint https://graszaad.nikhef.nl:8443"
            " -resourcetype wn"
            " -actiontype execute-now"
and the following policy execution flow at the end:
# policies
glexec_get_account:
verify_proxy -> scasclient
scasclient -> posix_enf
Using gLExec in a pilot job framework
When you use glexec with transient directories and input sandboxes, it is important that you create a writable directory for your target job, and that you do this in a safe and portable way. We provide a proof-of-principle implementation of how to create such a directory, and clean up after yourself, here:
- https://ndpfsvn.nikhef.nl/cgi-bin/viewvc.cgi/pdpsoft/trunk/grid-mw-security/glexec/util/mkgltempdir/
See also the more extensive text on GLExec TransientPilotJobs.
Exit Codes
glexec returns the following error codes:
201 - client error, which includes:
- no proxy is provided
- wrong proxy permissions
- target location is not accessible
- the binary to execute does not exist
- the mapped user has no rights to execute the binary when GLEXEC_CLIENT_CERT is not set
202 - system error
- glexec.conf is not present or malformed
- lcas or lcmaps initialization failure (this can be reproduced by moving the lcas/lcmaps db files away)
203 - authorization error
- user is not whitelisted
- local lcas authorization failure
- user banned by the SCAS server
- lcmaps failure on the scas server
- SCAS server not running
- network cable unplugged on the SCAS server host.
204 - the exit code of the called application overlaps with the previous ones
- the application called by glexec exited with code 201, 202, 203 or 204
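A pilot framework can use the codes listed above to tell gLExec failures apart from payload failures. The following is a minimal sketch. The helper name describe_exit_code is hypothetical, and it assumes that any code outside 201 to 204 is the exit code of the called application.

```python
# Exit codes reserved by glexec, as listed above.
GLEXEC_ERRORS = {
    201: "client error",
    202: "system error",
    203: "authorization error",
    204: "payload exit code overlaps with 201-204",
}


def describe_exit_code(code):
    """Classify an exit code as coming from glexec itself or from the payload."""
    if code in GLEXEC_ERRORS:
        return ("glexec", GLEXEC_ERRORS[code])
    # Assumption: every other value is passed through from the application.
    return ("payload", f"exit code {code}")
```

Note that 204 is ambiguous by design: it tells you the payload exited with one of the reserved codes, but not which one.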
Deployment scenarios in EGEE and OSG
The way gLExec is installed depends on the chosen scenario and on the way authorization is done in your infrastructure. Have a look at these installation and deployment guides for more information:
- gLExec installations in Open Science Grid
- YAIM supported installation in EGEE, both YAIM site-info.def variables and a specific section for gLExec on worker nodes installed with YAIM
- Installing gLExec on the worker node (setuid) manually is described here.
Pilot Job framework information and How To's
Need to Know's
The gLExec executable can be installed in two ways: with and without the setuid (file system) bit, owned by root. With the setuid bit enabled and root as owner, gLExec is effectively executed with root privileges. Without the setuid or setgid bits, the gLExec executable is like any other regular executable.
The safety features of gLExec are implemented with great care to avoid misuse and exploitation by anybody who executes it. As gLExec is typically installed with a setuid bit on root, this effectively means that anybody on the system is able to execute something with root privileges for a brief moment of time to perform the user switch.
A couple of safety features that are built into the gLExec tool are:
- On Unix and Linux systems, the operating system removes LD_LIBRARY_PATH, LD_RUN_PATH and other LD_* environment variables from the process environment before the first line of gLExec code is executed. Only /etc/ld.so.conf{.d/}, RPATH settings and other system-specific paths are used to resolve libraries. This holds for any setuid or setgid executable.
- The rest of the environment is stripped off by gLExec. There are a couple of environment settings that can easily lead to a root exploit in the standard library of a Unix or Linux system. Only the GLEXEC_* environment variables are kept. There is an option in the glexec.conf file to preserve more variables, but these must be selected with great care and set up by each system administrator on all their machines.
- If the target user is authorized and the mapping and Unix process identity switch succeed, HOME and X509_USER_PROXY are rewritten. Their values will contain the paths that are relevant for the target user account.
- The target user process has the Unix identity as mapped by LCMAPS. This could be from a separate set of pool accounts, or the regular set of pool accounts as given by the same user credentials from an LCG-CE or CREAM-CE. It could also be a pool account defined locally on the machine. The only assumption that holds is that the target user account has the privileges that are appointed to it by the local site administrator.
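The environment handling described in the list above can be pictured with the following sketch. It is an illustration of the rules as stated here, not gLExec's actual code. The function name sanitize_environment and its parameters are hypothetical, and the preserve argument stands in for the glexec.conf option mentioned above.

```python
def sanitize_environment(env, preserve=(), target_home=None, target_proxy=None):
    """Return the environment a payload would see after the rules above.

    env          -- mapping of the caller's environment variables
    preserve     -- extra variable names a site chose to keep (assumption)
    target_home  -- home directory of the mapped target account
    target_proxy -- proxy file location for the target account
    """
    # Keep only GLEXEC_* variables and explicitly preserved names.
    kept = {
        name: value
        for name, value in env.items()
        if name.startswith("GLEXEC_") or name in preserve
    }
    # HOME and X509_USER_PROXY are rewritten for the target account.
    if target_home is not None:
        kept["HOME"] = target_home
    if target_proxy is not None:
        kept["X509_USER_PROXY"] = target_proxy
    return kept
```

Anything a payload needs beyond this set, such as PATH, has to be either preserved by the site or re-established by the payload itself.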
How To's
To help you master the obstacles of gLExec's security we offer some interesting How To material:
- GLExec TransientPilotJobs describes how you may go about managing a target workload's transient area.
- GLExec Wrap and Unwrap environment variables describes how you can wrap environment variables in such a way that they are not wiped, and unwrap them (safely) in the target account (pilot job payload) process.
