How to work with our Quattor setup

From PDP/Grid Wiki
Revision as of 15:35, 27 September 2013 by Ronalds@nikhef.nl (talk | contribs) (→‎Overview)

This article gives an overview of how to perform basic and essential tasks with the Quattor setup used at Nikhef to manage the grid site.

Setup

To work with Quattor, one only needs a machine with a Subversion client and a Java development kit (JDK 1.5.0 or later). In practice, this means that Quattor templates can be edited and compiled on all current host platforms: Linux, Windows and Mac. The instructions in this article work under Linux (not just on the grid machines, but also on laptops); users of Windows and Mac may have to take additional steps, particularly for the build/deploy scripts.

Note that in the past it was mandatory to install the build tool ant separately. Currently, ant is provided by a checkout of the repository. It may still be necessary to tune the ant configuration on your local machine, e.g. by setting:

ANT_OPTS="-Xmx1563m"

in ~/.antrc (on Linux).
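The setting above can be added idempotently. The following is a minimal sketch, not part of the repository: ensure_ant_opts is a hypothetical helper that appends the ANT_OPTS line shown above to an antrc file unless the file already sets ANT_OPTS.

```shell
# Hypothetical helper: add ANT_OPTS to an antrc file unless it is already set.
# The -Xmx value is the one from this article; adjust it to your machine.
ensure_ant_opts() {
    antrc="${1:-$HOME/.antrc}"
    line='ANT_OPTS="-Xmx1563m"'
    # Leave the file alone if it already sets ANT_OPTS in any form.
    if [ -f "$antrc" ] && grep -q '^ANT_OPTS=' "$antrc"; then
        echo "unchanged"
    else
        printf '%s\n' "$line" >> "$antrc"
        echo "added"
    fi
}
```

Calling ensure_ant_opts without arguments works on ~/.antrc; a different file can be passed as the first argument.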

Everyone who wants to make changes to the Quattor setup should make them in a private working area. The working area can be created by checking out from the repository; see section Using the repository for details.

After making the changes in the private working copy, the templates should be compiled. How this is done is described in the section Compiling.

If the compilation was successful, i.e., completed without errors or warnings, the modified templates can be checked in to the Subversion repository. Note: modified templates should only be checked in if they are to be deployed immediately! Checking in templates without deploying them could cause surprises for the next person who deploys changes, and who would unknowingly roll out yours along with their own.

The deployment phase consists of a checkout on the Quattor server using a dedicated account, followed by a compilation and deployment procedure. The details of this process are given in section Deploying.

Some actions on this page require that an environment variable $L exists that points to the conf directory of the Quattor setup, and that the environment variable $PATH includes the directory $L/../bin.

Using the repository

The Quattor setup, comprising the template hierarchy and the SCDB setup, is stored on the Subversion server ndpfsvn.nikhef.nl in repository ndpf. Access to this repository is limited to selected administrators. See the Wiki article on SVN access for details.

Note that although the setup is based on SCDB, Nikhef uses only certain parts from SCDB.

Checking out

Before checking out, ensure there is an ssh agent running that contains the key that is registered on the SVN server. To check out a working copy of the trunk revision under the current working directory, use the following command:

svn co svn+ssh://svn@ndpfsvn.nikhef.nl/repos/ndpf/nl.nikhef.ndpf.quattor-config/trunk

This will create a directory trunk in the working directory, which contains the checked out tree. The file system hierarchy under trunk includes the SCDB setup for Nikhef as well as the Quattor template hierarchy:

trunk +- bin
      +- conf

where bin contains some tools, and where conf is the root of the configuration setup. The actual templates are present under conf/cfg.
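The ssh agent precondition for the checkout can be verified before contacting the server. This is a sketch under the assumption that ssh-add is available; checkout_quattor is a hypothetical wrapper, and the repository URL is the one given above.

```shell
# Hypothetical wrapper: refuse to check out when no ssh agent with keys is available.
checkout_quattor() {
    # ssh-add -l exits non-zero when no agent is reachable or the agent holds no keys.
    if ! ssh-add -l >/dev/null 2>&1; then
        echo "no ssh agent with loaded keys; start ssh-agent and run ssh-add first" >&2
        return 1
    fi
    svn co svn+ssh://svn@ndpfsvn.nikhef.nl/repos/ndpf/nl.nikhef.ndpf.quattor-config/trunk
}
```

The check only confirms that the agent holds at least one key, not that it is the key registered on the SVN server.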

At this point it is a good idea to set up two environment variables (e.g. in .bashrc):

export L=/path/to/trunk/conf
export PATH=$L/../bin:$PATH

After completing the check out and defining the environment, it should be possible to compile templates.
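Whether the environment is set up as described can be checked with a small function. This is an illustrative sketch; check_quattor_env is a hypothetical name and not one of the tools in bin.

```shell
# Hypothetical helper: verify that $L and $PATH are set up as described above.
check_quattor_env() {
    if [ -z "${L:-}" ]; then
        echo "L is not set" >&2
        return 1
    fi
    if [ ! -d "$L" ]; then
        echo "L does not point to a directory: $L" >&2
        return 1
    fi
    # PATH must contain the literal entry $L/../bin, as exported above.
    case ":$PATH:" in
        *":$L/../bin:"*) return 0 ;;
        *) echo "PATH does not include $L/../bin" >&2; return 1 ;;
    esac
}
```

The function returns 0 when both variables are in order, so it can guard other commands, e.g. check_quattor_env && makexprof -f itb.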

Updating

When a working copy exists, it can be synchronized with the latest revision in the SVN repository. To retrieve the latest files, and merge them with local modifications if possible, the "svn up" command can be executed in the working directory or any of its child directories:

svn up 

This command will update the contents of the working directory and its child directories. If optional directories are specified, only the contents under those directories are updated.
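It can be useful to see which files changed on the server before merging them. The sketch below uses svn status --show-updates for that and then updates; update_tree and the DRY_RUN switch are hypothetical conveniences, not part of the setup.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical wrapper: show pending server-side changes, then update.
# Optional arguments restrict both steps to the given directories.
update_tree() {
    run svn status --show-updates "$@" &&
    run svn up "$@"
}
```

For example, update_tree cfg run from $L limits the update to the template hierarchy.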

Checking in

Checking in ("committing") the local changes into the SVN repository is done via the "svn commit" command:

svn commit 

The above command will commit all changes made under the current directory. It is also possible to specify which directories and/or files will be committed:

svn commit dir1 dir2 file3

will commit all changes in the hierarchies under dir1 and dir2, as well as in file file3.
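Since templates should only be checked in after a successful compilation, the two steps can be chained. The following sketch assumes that makexprof (described in the section Compiling) exits with a non-zero status when compilation fails; that behaviour is not confirmed here, and warnings would still have to be checked by eye. compile_and_commit and DRY_RUN are hypothetical.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical wrapper. Usage: compile_and_commit FACILITY MESSAGE [PATH...]
compile_and_commit() {
    if [ $# -lt 2 ]; then
        echo "usage: compile_and_commit FACILITY MESSAGE [PATH...]" >&2
        return 2
    fi
    facility=$1
    message=$2
    shift 2
    # Assumption: makexprof returns non-zero when the compilation fails.
    if ! run makexprof -f "$facility"; then
        echo "compilation failed; nothing committed" >&2
        return 1
    fi
    run svn commit -m "$message" "$@"
}
```

Remember that a commit should be followed immediately by a deployment.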

Compiling

Compiling the Quattor setup consists of two phases:

1. Syntax checking

2. Translation and validation

The syntax check is performed on all *.tpl and *.pan files under the base directory of the setup. This means that every file with one of these file name extensions must contain valid Pan syntax, even if it is not used by any profile.

During the second phase, the compiler combines selected (valid) template files and performs some validity checks. These checks are defined in the template files and allow for compile-time checks of the values of paths in Pan. If there are no problems, the templates are compiled into an XML file. Assuming that an environment variable $L exists that points to the conf directory, the compiled templates are stored in the directory $L/build/xml.

The shell script makexprof can be used to compile profiles, provided that the directory $L/../bin is part of the environment variable PATH. Usage of the makexprof script:

makexprof [options] -f <facility-name> [hostname1] [hostname2] [...]

The option -f with argument <facility-name> is mandatory and selects the facility to compile. Currently defined facilities: itb (Installation Test Bed), prd (Production cluster), opn (Optical Private Network) and generic (miscellaneous Quattor-managed servers). If one or more host names (object template names) are given, the compilation is only performed for the hosts under the given facility that match the specified names.

The following options can be used:

-d       Enable debug output
-h       Show usage information
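A mistyped facility name can be caught before the compiler starts. This is a minimal sketch; valid_facility is a hypothetical helper whose list mirrors the facilities named above and would need updating if facilities are added.

```shell
# Hypothetical helper: accept only the facility names listed in this article.
valid_facility() {
    case "$1" in
        itb|prd|opn|generic) return 0 ;;
        *) echo "unknown facility: $1 (expected itb, prd, opn or generic)" >&2
           return 1 ;;
    esac
}
```

Typical use would be valid_facility itb && makexprof -f itb to compile the whole test bed, or makexprof -d -f prd node1 node2 to compile two hosts with debug output (node1 and node2 are placeholder host names).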

Deploying

The deployment phase should always immediately follow the check-in of modified templates, so as not to surprise your colleagues at a later stage with the (rotten?) fruits of your efforts.

Unlike the other steps, the deployment step is executed with the ndpfmgr account on the Quattor server. Only selected persons can log in to this account via ssh. It is strongly advised to use ssh key-forwarding when logging in. Currently, the Quattor server stal should be used for deploying templates.

Start by updating to the latest Quattor templates stored in Subversion:

cd $L/cfg
svn up

The templates should be compiled and the nodes should be triggered to refresh their local copies. This is handled by the command pushxprof (which is in fact a symlink to and supports the same arguments as makexprof):

pushxprof [options] -f <facility-name> [hostname1] [hostname2] [...]

Note that it may be necessary to use option -u to refresh the repository contents after new packages have been added to the repositories. If the compilation step is successful, the newly compiled profiles are created in $L/build/xml and subsequently copied to the directory $L/deploy/xml. The contents of the latter are exposed via a web server to the nodes. All involved nodes are then informed that a new profile exists. After receiving the notification, the nodes download their new profiles and start changing the node configuration if needed.
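The deployment steps above can be combined into one function for the ndpfmgr account. This is only a sketch: deploy and DRY_RUN are hypothetical, and the commands are the ones shown in this section.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical wrapper. Usage: deploy FACILITY [HOSTNAME...]
deploy() {
    if [ $# -lt 1 ] || [ -z "${L:-}" ]; then
        echo "usage: deploy FACILITY [HOSTNAME...] (requires \$L)" >&2
        return 2
    fi
    facility=$1
    shift
    # The subshell keeps the caller's working directory unchanged.
    (
        run cd "$L/cfg" &&
        run svn up &&
        run pushxprof -f "$facility" "$@"
    )
}
```

Running it first with DRY_RUN=1 shows the exact commands without touching the nodes.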

It is possible to use makexprof as user ndpfmgr. However, this will only compile templates (resulting in XML files in $L/build/xml) and those compiled profiles will not be visible to the nodes!
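To see which compiled profiles are not (yet) visible to the nodes, the two directories can be compared. The sketch below assumes that profiles are stored as plain *.xml files with the same name in both directories, which this article does not confirm; undeployed_profiles is a hypothetical helper.

```shell
# Hypothetical helper: list profiles in $L/build/xml that are missing from,
# or differ from, their counterpart in $L/deploy/xml.
undeployed_profiles() {
    build="$L/build/xml"
    deployed="$L/deploy/xml"
    for f in "$build"/*.xml; do
        [ -e "$f" ] || continue    # no matches: the glob stays literal
        name=$(basename "$f")
        # cmp -s prints nothing and fails when the files differ or one is missing.
        cmp -s "$f" "$deployed/$name" 2>/dev/null || echo "$name"
    done
}
```

An empty result means that everything compiled so far has also been deployed.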

Get the machine running the new template

To (re)install one of the machines:

Log in to stal as root (or, as ndpfmgr, use sudo /usr/sbin/aii-shellfe).

Run the following commands:

aii-shellfe --configure <hostname>

aii-shellfe --install <hostname>

To restart the machine with ipmi:

ipmitool -I lan -H <hostname>.ipmi.nikhef.nl -U [root|ADMIN] power cycle
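The reinstall sequence can be wrapped so that the power cycle only happens when both aii-shellfe steps succeed. This is a sketch using only the commands shown above; reinstall_host and DRY_RUN are hypothetical, and the IPMI user (root or ADMIN) has to be supplied by the caller.

```shell
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical wrapper. Usage: reinstall_host HOSTNAME IPMI_USER
reinstall_host() {
    if [ $# -ne 2 ]; then
        echo "usage: reinstall_host HOSTNAME IPMI_USER" >&2
        return 2
    fi
    host=$1
    user=$2
    # Stop at the first failing step so the machine is not rebooted half-configured.
    run aii-shellfe --configure "$host" &&
    run aii-shellfe --install "$host" &&
    run ipmitool -I lan -H "$host.ipmi.nikhef.nl" -U "$user" power cycle
}
```

A dry run such as DRY_RUN=1 reinstall_host node1 root (node1 is a placeholder host name) prints the three commands for review.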


To get the machine in rescue mode:

Log in to stal as root.

Run the following command:

aii-shellfe --rescue <hostname>

To restart the machine with ipmi:

ipmitool -I lan -H <hostname>.ipmi.nikhef.nl -U [root|ADMIN] power cycle


After the restart, the machine shows a boot menu in which you can choose the tool you want to use, for example memtest.

To return the machine to its normal boot state:

aii-shellfe --boot <hostname>





