Upgrading Quattor-managed gLite servers
Updating a gLite release
Updating a gLite release consists of the following steps:
- Synchronizing our copy of the gLite repository with the official release repository at CERN
- Generating Quattor templates for the updates and creating a new gLite update branch in the namespace
- Compiling the profiles for the hosts in the Installation Test Bed using the new gLite updates
- Deploying and troubleshooting
- Making the latest gLite updates the default for all clusters
The following notes were taken during the upgrade from glite-3.1-update-29 to glite-3.1-update-31. They depend heavily on our installation and might not be applicable at other Quattor installations without major changes; copy and paste will surely not work.
Synchronization of the local gLite repository
As ndpfmgr@stal
- Run the script ~/bin/mirror-glite-3.1
This synchronizes our gLite 3.1 mirror at stal (under directory /project/quattor/www/html/mirror/glite) with the official repository at CERN (fetch release from: http://glitesoft.cern.ch/EGEE/gLite/).
Generation of the update templates & creation of a new update branch
This needs to be executed in your own working Quattor environment, either on a Quattor server (e.g. stal) or on your own laptop. In particular, the environment variable $L must be defined and point to a usable Quattor repository checkout.
Generating the update templates
The first argument of the rpmUpdates.pl script is the mirror directory that was synchronized as ndpfmgr in the previous step.
- $L/../bin/rpmUpdates.pl /project/quattor/www/html/mirror/glite/3.1/generic/sl4/i386/updates/ > /tmp/31-i386
- $L/../bin/rpmUpdates.pl /project/quattor/www/html/mirror/glite/3.1/generic/sl4/x86_64/updates/ > /tmp/31-x86-64
The commands above create Quattor templates that will replace all existing packages (in the Quattor profile) with the most recent versions found in the update repositories. Note that this is a blunt approach that does not take into account packages that were added or deleted as part of an update.
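Because the generated templates ignore added and deleted packages, it can help to compare the package names in the old and new templates before merging. The following is a minimal sketch, not part of Quattor. It assumes one pkg_repl("name","version","arch") entry per line, and the function names pkg_names and pkg_diff are hypothetical.

```shell
#!/bin/sh
# Print the sorted, unique package names found in pkg_repl(...) entries.
# Assumes at most one pkg_repl entry per line.
pkg_names() {
    sed -n 's/.*pkg_repl("\([^"]*\)".*/\1/p' "$1" | sort -u
}

# Report package names that appear in only one of two templates.
# Usage: pkg_diff OLD_TEMPLATE NEW_TEMPLATE
pkg_diff() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    pkg_names "$1" > "$old_list"
    pkg_names "$2" > "$new_list"
    # comm -23: lines only in the first file; comm -13: only in the second.
    comm -23 "$old_list" "$new_list" | sed 's/^/only in old: /'
    comm -13 "$old_list" "$new_list" | sed 's/^/only in new: /'
    rm -f "$old_list" "$new_list"
}
```

Called with the rpms.tpl of the previous update and a freshly generated template, it lists candidates for manual review; version changes of packages present in both files are not reported.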
Creating a new update branch
Our Quattor hierarchy contains one directory hierarchy per gLite update. This structure makes it possible to select which gLite updates are installed per cluster or even per individual node.
- cd $L/cfg
- cp /tmp/glite-3.1-update-31.tpl ./
- cd $L/cfg/grid/glite-3.1/update/
- cp -a 29 31
- find $L/cfg/grid/glite-3.1/update/31 -type d -name CVS -exec rm -rf {} \; (deletes the CVS directories; this command will change when moving to Subversion)
- for file in `grep -H -r "/29/" 31/* |awk -F : '{print $1}'`; do sed -i "s/\/29\//\/31\//g" $file; done (replaces the old update number with the new one)
- replace the entries in $L/cfg/grid/glite-3.1/update/31/{i386,x86_64}/rpms.tpl with the entries in glite-3.1-update-31.tpl; take care to uncomment the torque packages and to append the manually added packages
- rm $L/cfg/glite-3.1-update-31.tpl (the file copied above; it will not pass the Quattor syntax checks)
- edit $L/cfg/grid/glite-3.1/glite/defaults.tpl and set the update version number to the new one (this makes the new update the default version)
- makexprof -u -f itb tbn09 (try to compile it for some testbed node)
- check if the changes for 64 bit are also OK, for example: makexprof -f prd hooi-ei-12
- if OK, submit the changes to CVS
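The copy, cleanup and renumbering steps above can be sketched as one shell function. This is a minimal sketch, not part of Quattor: the name new_update_branch and its arguments are hypothetical, and the manual steps (merging rpms.tpl, editing defaults.tpl, compiling and committing) are deliberately left to the operator.

```shell
#!/bin/sh
# Clone an update branch and rewrite the update number inside the copy.
# Usage: new_update_branch BASE_DIR OLD_NR NEW_NR
#   e.g. new_update_branch "$L/cfg/grid/glite-3.1/update" 29 31
new_update_branch() {
    base=$1
    old=$2
    new=$3

    [ -d "$base/$old" ] || { echo "no such branch: $base/$old" >&2; return 1; }
    [ ! -e "$base/$new" ] || { echo "already exists: $base/$new" >&2; return 1; }

    cp -a "$base/$old" "$base/$new" || return 1

    # Remove CVS bookkeeping directories. -prune stops find from
    # descending into a directory that is about to be deleted.
    find "$base/$new" -type d -name CVS -prune -exec rm -rf {} +

    # Rewrite "/OLD/" to "/NEW/" only in files that mention the old number.
    grep -rl "/$old/" "$base/$new" | while IFS= read -r file; do
        sed "s|/$old/|/$new/|g" "$file" > "$file.new" && mv "$file.new" "$file"
    done
}
```

Refusing to overwrite an existing target directory keeps a second run from nesting the copy inside the first one, which is what a bare cp -a would do.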
Deployment (and troubleshooting)
Again as ndpfmgr@stal. Fetch the updates that were committed as a normal user above, confirm that they compile, and push the profile to a testbed node. The steps are:
- cd $L/cfg
- cvs upd -APd
- makexprof -u -f itb tbn09
- pushxprof -u -f itb tbn09
- if OK, compile and push the profiles for all of itb: makexprof -f itb
- pushxprof -f itb
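The compile and push sequence above can be wrapped so that the full itb rollout only starts after the single-node test has succeeded. This is a sketch under an assumption these notes do not confirm: that makexprof and pushxprof exit with a non-zero status on failure. The function name rollout is hypothetical; the command lines are the ones listed above.

```shell
#!/bin/sh
# Compile and push one test node first, then the whole group.
# Usage: rollout GROUP TEST_NODE    e.g. rollout itb tbn09
# Assumption: makexprof and pushxprof return non-zero when they fail.
rollout() {
    group=$1
    node=$2

    # Single-node test; the && chain stops at the first failing command.
    makexprof -u -f "$group" "$node" &&
    pushxprof -u -f "$group" "$node" || {
        echo "test on $node failed, not touching the rest of $group" >&2
        return 1
    }

    # Full rollout for the group.
    makexprof -f "$group" && pushxprof -f "$group"
}
```

An exit status of zero only shows that the tools ran; checking the node itself and the monitoring system is still needed before trusting the rollout.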
And finally-A, since Quattor does some kind of package management
If you are notified, preferably by a monitoring system (if you don't have one, good luck), that something does not work, you could try the following:
- Log on to the host that shows problems
- tail /var/log/ncm/ncd.log
2008/10/02-15:25:30 [INFO] Errors while configuring spma (1)
2008/10/02-15:25:30 [ERROR] 1 errors, 0 warnings executing configure
This tells us that we actually have to look into
- tail /var/log/spma.log
2008/10/02-15:25:28 [WARN] Errors found:
depcheck: package glite-UI 3.1.19-0 needs glite-amga-api-python >= 1.3.0-1
depcheck: package glite-UI 3.1.19-0 needs glite-amga-cli >= 1.3.0-4
there were 2 dependency problem(s) and 0 conflict(s)
With this information, visit the web page that lists the RPMs for this particular server type, in our case: http://glite.web.cern.ch/glite/packages/R3.1/deployment/glite-UI/3.1.20-0/glite-UI-3.1.20-0.html
The missing packages have to be added to the respective RPM list, step by step:
And finally-B, again as normal-user@stal
Here the Quattor package list for the UI is incomplete. Note that the missing packages are not added to the updates directory; they go into the "base list".
- In our example the following should work:
cat << EOF >> $L/cfg/grid/glite-3.1/glite/ui/rpms.tpl
"/software/packages"=pkg_repl("glite-amga-cli","1.3.0-4","i386");
"/software/packages"=pkg_repl("glite-amga-api-python","1.3.0-1","noarch");
EOF
- makexprof -f prd bosui
- If compilation is successful, proceed with:
cvs commit
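The pkg_repl entries for missing dependencies can be drafted directly from the depcheck lines in spma.log. The sketch below is a hypothetical helper, not part of Quattor or SPMA, and handles only the "needs NAME >= VERSION" form seen in the log excerpt. The log does not state the architecture, so the output carries the placeholder ARCH, which must be replaced by hand using the RPM list for the server type.

```shell
#!/bin/sh
# Turn "needs NAME >= VERSION" fragments into draft pkg_repl entries.
# Reads the named files, or standard input when no file is given.
# The architecture is not in the log, so "ARCH" is left as a placeholder.
depcheck_to_pkg_repl() {
    awk '
        {
            # Scan every field so that several depcheck messages
            # on one physical line are all picked up.
            for (i = 1; i + 3 <= NF; i++) {
                if ($i == "needs" && $(i + 2) == ">=") {
                    printf "\"/software/packages\"=pkg_repl(\"%s\",\"%s\",\"ARCH\");\n", $(i + 1), $(i + 3)
                }
            }
        }
    ' "$@" | sort -u
}
```

A possible use is depcheck_to_pkg_repl /var/log/spma.log, reviewing the result and filling in the architectures before appending anything to rpms.tpl. The required version is a lower bound, so the version actually shipped in the release may be newer.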
And finally-C, again as ndpfmgr@stal
Fetch the updates that were committed as a normal user above, confirm that they compile, and push the profile to the node that is missing packages (here bosui, a UI)
- cd $L/cfg/grid/glite-3.1/glite/ui
- cvs update rpms.tpl
- makexprof -u -f prd bosui
- if OK, pushxprof -u -f prd bosui
- check monitoring system