GLExec Epilogue Functionality
Revision as of 13:35, 22 May 2012
Starting from version 0.9 gLExec can optionally run an epilogue executable after the payload has finished.
In linger mode, gLExec can optionally run a trusted executable intended to clean up the payload environment. It runs only when the glexec option epilogue is set. The option should point to the absolute path of a trusted executable: it must not be possible for anyone except the root user (or the epilogue_user and/or members of the epilogue_group, when set) to change the executable. It runs with uid/gid 0,0 unless epilogue_user and/or epilogue_group are set. If it does not finish within the configured epilogue_timeout, it is sent a SIGTERM. For proper functioning it is advised that gLExec itself does the userswitch (instead of LCMAPS).
If the epilogue fails for whatever reason, gLExec returns with either a 202 exit code (internal gLExec error) or potentially a 204 (e.g. when the epilogue itself returned an exit code in the 201-204 range).
The epilogue runs with stdin, stdout and stderr all attached to /dev/null. No special logging functionality is implemented; logging is left to the developer of the epilogue code.
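Because the standard streams point at /dev/null, anything an epilogue prints is lost. The following is a minimal sketch of an epilogue written in Python that reports through the standard `syslog` module instead. The ident `glexec-epilogue`, the `LOG_DAEMON` facility and the message layout are illustrative assumptions, not something gLExec prescribes; the two environment variables are taken from the variable table on this page.

```python
import os
import syslog


def format_epilogue_message(environ, text):
    """Prefix a log line with the payload identity found in the environment."""
    user = environ.get("GLEXEC_EPILOG_TARGET_USER", "unknown")
    pid = environ.get("GLEXEC_EPILOG_TARGET_PID", "unknown")
    return "target_user=%s payload_pid=%s %s" % (user, pid, text)


def log_epilogue(text, environ=None, priority=syslog.LOG_INFO):
    """Send one line to syslog; print() would end up in /dev/null."""
    if environ is None:
        environ = os.environ
    # Assumption: ident and facility are site choices, not gLExec defaults.
    syslog.openlog("glexec-epilogue", syslog.LOG_PID, syslog.LOG_DAEMON)
    try:
        syslog.syslog(priority, format_epilogue_message(environ, text))
    finally:
        syslog.closelog()
```

An epilogue would call `log_epilogue("cleanup started")` at the points it wants recorded. Keeping the formatting in a separate function makes it possible to test the message layout without a running syslog daemon.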
The epilogue can be configured using the following glexec.conf settings:
|epilogue||When set, the name of the trusted binary or script to run. Needs to be an absolute canonical path.|
|epilogue_user||When set, the epilogue will be run with this user identity. In addition this user is allowed to have write permission for the epilogue executable (i.e. is trusted). This option can only be used when gLExec does the userswitch. It can be useful if the script is located on an NFS with root squash. Default: root.|
|epilogue_group||When set, the epilogue will be run with this group identity. In addition members of this group are allowed to have write permission for the epilogue executable (i.e. are trusted). When unset, the executable will be run with GID 0 and no group will be trusted. This option can only be used when gLExec does the userswitch. It can be useful if the script is located on an NFS with root squash.|
|epilogue_timeout||The epilogue executable will run for at most this timeout in seconds, before being sent a SIGTERM (and SIGKILL). Default: 300 seconds.|
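When testing an epilogue outside gLExec, it can help to reproduce the timeout behaviour described for epilogue_timeout: a SIGTERM once the timeout expires, followed by a SIGKILL. The sketch below does this with Python's `subprocess` module. It is not gLExec's implementation; in particular the `grace` delay between the two signals is an assumption, since this page does not state how long gLExec waits before sending SIGKILL.

```python
import signal
import subprocess


def run_with_timeout(argv, timeout, grace=5.0):
    """Run argv with stdio on /dev/null; SIGTERM on timeout, then SIGKILL.

    Returns the child's return code, which is the negative signal number
    when the child was terminated by a signal.
    """
    child = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Timeout reached: ask the child politely to stop.
        child.send_signal(signal.SIGTERM)
    try:
        # Assumption: give the child `grace` seconds to react to SIGTERM.
        return child.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        child.kill()  # sends SIGKILL, which cannot be caught or ignored
        return child.wait()
```

For example, `run_with_timeout(["/path/to/epilogue"], timeout=300)` mirrors the documented default of 300 seconds; the path is a placeholder.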
The epilogue runs with the same cleaned environment that gLExec sets up for the payload, plus a number of additional variables, all starting with GLEXEC_EPILOG_. Any variables starting with GLEXEC_EPILOG_ that were already set when gLExec was invoked are cleared before the epilogue is run.
|GLEXEC_EPILOG_ARGV<N>||argv of payload|
|GLEXEC_EPILOG_GLEXEC_USER||calling user's username|
|GLEXEC_EPILOG_GLEXEC_GROUP||calling user's primary groupname|
|GLEXEC_EPILOG_GLEXEC_UID||calling user's uid|
|GLEXEC_EPILOG_GLEXEC_GID||calling user's primary gid|
|GLEXEC_EPILOG_GLEXEC_SGIDS||calling user's secondary gids, colon separated|
|GLEXEC_EPILOG_TARGET_USER||target user's username|
|GLEXEC_EPILOG_TARGET_GROUP||target user's primary groupname|
|GLEXEC_EPILOG_TARGET_UID||target user's uid|
|GLEXEC_EPILOG_TARGET_GID||target user's primary gid|
|GLEXEC_EPILOG_TARGET_SGIDS||target user's secondary gids, colon separated|
|GLEXEC_EPILOG_GLEXEC_PID||lingering gLExec process ID|
|GLEXEC_EPILOG_GLEXEC_SID||lingering gLExec session ID|
|GLEXEC_EPILOG_GLEXEC_PGID||lingering gLExec process group|
|GLEXEC_EPILOG_TARGET_PID||payload process ID|
|GLEXEC_EPILOG_TARGET_PGID||payload process group|
|GLEXEC_EPILOG_TARGET_RC||payload exit code|
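As an illustration of how an epilogue might consume these variables, the sketch below reads them from a mapping such as `os.environ`. The helper names are hypothetical. The numbering of `GLEXEC_EPILOG_ARGV<N>` is not specified on this page, so the code makes no assumption about the first index and simply orders the entries by their numeric suffix.

```python
import re
from typing import List, Mapping, Optional

PREFIX = "GLEXEC_EPILOG_"
_ARGV_RE = re.compile(r"^GLEXEC_EPILOG_ARGV(\d+)$")


def payload_argv(environ: Mapping[str, str]) -> List[str]:
    """Collect GLEXEC_EPILOG_ARGV<N> values, ordered by the number N."""
    found = []
    for name, value in environ.items():
        match = _ARGV_RE.match(name)
        if match:
            found.append((int(match.group(1)), value))
    return [value for _, value in sorted(found)]


def int_var(environ: Mapping[str, str], suffix: str) -> Optional[int]:
    """Return e.g. TARGET_UID or TARGET_RC as int, or None if unusable."""
    raw = environ.get(PREFIX + suffix)
    if raw is None:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


def gid_list(environ: Mapping[str, str], suffix: str) -> List[int]:
    """Split a colon-separated gid list such as TARGET_SGIDS into ints."""
    raw = environ.get(PREFIX + suffix, "")
    return [int(part) for part in raw.split(":") if part.isdecimal()]
```

Inside a real epilogue these would be called with `os.environ`, for instance `int_var(os.environ, "TARGET_RC")` to obtain the payload exit code.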
- To prevent tampering with the epilogue binary or script, its permissions must be such that only the root user, and optionally the epilogue_user, has write access to the file or to any of its path members (it is "trusted-root").
- gLExec becomes immune to signals from any user but root.
- It is important to note that writing an epilogue should be done with the utmost care:
  - it will (normally) be run by the root user
  - it is triggered automatically
  - blindly killing all processes from the payload user can kill good processes
- Logging should be done in a secure way, e.g. to either syslog or to a trusted file location.
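The "trusted-root" requirement from the first note can be checked before deploying an epilogue. The sketch below approximates the rule as it is described on this page: the path must be absolute and canonical, and the file and every directory above it must be writable only by root, by an optional trusted user, or by members of an optional trusted group. It is an approximation for site administrators, not the check gLExec itself performs, and it ignores ACLs. The function name and parameters are hypothetical.

```python
import os
import stat


def is_trusted_path(path, trusted_uid=0, trusted_gid=None):
    """Return True if only root / trusted_uid / trusted_gid can modify
    `path` or any of its parent directories."""
    # The epilogue option needs an absolute canonical path.
    if not os.path.isabs(path) or os.path.realpath(path) != path:
        return False
    current = path
    while True:
        try:
            info = os.lstat(current)
        except OSError:
            return False  # missing or unreadable path member
        if info.st_uid not in (0, trusted_uid):
            return False  # owned by an untrusted user
        if info.st_mode & stat.S_IWOTH:
            return False  # world-writable
        # Group write is only acceptable for the trusted group; with
        # trusted_gid=None no group is trusted, not even GID 0.
        if info.st_mode & stat.S_IWGRP and info.st_gid != trusted_gid:
            return False
        parent = os.path.dirname(current)
        if parent == current:
            return True  # reached the filesystem root
        current = parent
```

Passing the numeric ids behind epilogue_user and epilogue_group as `trusted_uid` and `trusted_gid` models the relaxed rule that applies when those options are set.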
Providing a catch-all example script is not possible as this heavily depends on site details. Sites might want to have a look at Nikhef's reaper script, intended to clean up daemonized processes after a grid job has finished.