Difference between revisions of "Agile testbed"
From PDP/Grid Wiki
(move operation procedures forward) |
|||
Line 35: | Line 35: | ||
The user identities on bleek are managed in the [[NDPFDirectoryImplementation|Nikhef central LDAP directory]], as is customary on many of the testbed VMs. The home directories are located on bleek and NFS exported to those VMs that wish to use them (but not to the Virtual hosts, who don't need them). | The user identities on bleek are managed in the [[NDPFDirectoryImplementation|Nikhef central LDAP directory]], as is customary on many of the testbed VMs. The home directories are located on bleek and NFS exported to those VMs that wish to use them (but not to the Virtual hosts, who don't need them). | ||
+ | |||
+ | == Operational procedures == | ||
+ | |||
+ | The testbed is not too tightly managed, but here's an attempt to keep our sanity. | ||
+ | |||
+ | === Server certificates === | ||
+ | |||
+ | Host or server SSL certificates for volatile machines in the testbed are kept on span.nikhef.nl:/var/local/hostkeys. The FQDN of the host determines which CA should be used: | ||
+ | * for *.nikhef.nl, the TERENA eScience SSL CA should be used, | ||
+ | * for *.testbed, the testbed CA should be used. | ||
+ | |||
+ | ==== Generating certificate requests for the TERENA eScience SSL CA ==== | ||
+ | |||
+ | * Go to bleek.nikhef.nl:/var/local/hostkeys/pem/ | ||
+ | * Generate a new request by running ../[https://ndpfsvn.nikhef.nl/cgi-bin/viewvc.cgi/pdpsoft/trunk/agiletestbed/make-terena-req.sh?view=co make-terena-req.sh] ''hostname''. This will create a directory for the hostname with the key and request in it. | ||
+ | * Send the resulting newrequest.csr file to the local registrar (Paul or Elly). | ||
+ | * When the certificate file comes back, install it in /var/local/hostkeys/pem/''hostname''/. | ||
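+ | |||
+ | Before installing a returned certificate it can be worth checking that it belongs to the key generated earlier. The following is only a minimal sketch: it assumes openssl is available and that the files are named hostkey.pem and hostcert.pem as elsewhere on this page; the function name cert_matches_key is made up for illustration and is not part of make-terena-req.sh. | ||

```shell
# Succeeds only if the certificate and the private key carry the same
# public key. cert_matches_key is a hypothetical helper (assumption),
# not something shipped with the testbed scripts.
cert_matches_key() {
    cert=$1 key=$2
    # Public key embedded in the certificate
    cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 1
    # Public key derived from the private key
    key_pub=$(openssl pkey -in "$key" -pubout) || return 1
    [ "$cert_pub" = "$key_pub" ]
}
```

+ | For example, <tt>cert_matches_key hostcert.pem hostkey.pem && echo ok</tt>, run inside /var/local/hostkeys/pem/''hostname''/. | ||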
+ | |||
+ | ==== Requesting certificates from the testbed CA ==== | ||
+ | |||
+ | Kindly ask Dennis. The CA key is on his eToken, which means no one else can generate host certificates. Some time in the future this will be replaced by a simple CA setup on the testbed itself. | ||
+ | |||
+ | |||
+ | === Logging of changes === | ||
+ | |||
+ | All changes need to be communicated by e-mail to [mailto:CTB-changelog@nikhef.nl CTB-changelog@nikhef.nl]. | ||
+ | |||
+ | (This replaces the earlier [[CTB Changelog]].) | ||
+ | |||
+ | === Adding a new machine === | ||
+ | |||
+ | ==== Preparations on bleek ==== | ||
+ | * edit | ||
+ | /etc/hosts | ||
+ | /etc/ethers | ||
+ | to add the new machine and its hardware address. | ||
+ | * Restart dnsmasq | ||
+ | /etc/init.d/dnsmasq restart | ||
+ | * on span.nikhef.nl, run | ||
+ | /usr/local/bin/keygen <hostname> | ||
+ | to pre-generate ssh keys. | ||
+ | * on span, run | ||
+ | /var/local/hostkeys/generate-knownhosts.sh | ||
+ | * on all machines, do | ||
+ | cp /var/local/hostkeys/ssh_known_hosts /etc/ssh/ssh_known_hosts | ||
+ | * (optional) generate or request an X509 host certificate. For local machines in the .testbed domain, Dutchgrid certificates won't be issued; a testbed-wide CA is used instead (ask Dennis). The certificate and key are stored in | ||
+ | /var/local/hostkeys/pem/<hostname>/hostcert.pem | ||
+ | /var/local/hostkeys/pem/<hostname>/hostkey.pem | ||
+ | * place a 'firstboot' script on span in | ||
+ | /var/local/xen/firstboot/<hostname> | ||
+ | (it will be downloaded and run the first time after installation of the machine.) | ||
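+ | |||
+ | The first step of the list above (registering the machine in /etc/hosts and /etc/ethers) could be scripted roughly as follows. This is a sketch under assumptions: add_line is a hypothetical helper, and the IP address in the example is a placeholder, not a real testbed assignment. | ||

```shell
# Append a line to a file unless that exact line is already present,
# so the step can be repeated without creating duplicates.
# add_line is a hypothetical helper (assumption).
add_line() {
    file=$1 line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (as root on bleek; 10.0.0.50 is a placeholder address):
#   add_line /etc/hosts  "10.0.0.50 my-new-machine.testbed"
#   add_line /etc/ethers "52:54:00:48:22:32 my-new-machine.testbed"
#   /etc/init.d/dnsmasq restart
```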
+ | |||
+ | ==== Starting the installation on the dom0 ==== | ||
+ | |||
+ | The old generate-machine script still works for Xen Dom0. For blade13 and blade14 (and other libvirt machines), a virt-install command along these lines works, with the caveats noted below: | ||
+ | |||
+ | virt-install --name my-new-machine.testbed --ram 1024 --disk pool=vmachines,size=10 \ | ||
+ | --network bridge=br0,mac=52:54:00:48:22:32 --os-type linux --os-variant rhel5.4 \ | ||
+ | --location http://spiegel.nikhef.nl/mirror/centos/5/os/x86_64 \ | ||
+ | --extra ks=http://bleek.nikhef.nl/mktestbed/kickstart/centos5-kvm.ks | ||
+ | |||
+ | Take into account that the MAC address must be the one configured in /etc/ethers on bleek.nikhef.nl; the disk pool points to the Compellent Fibre Channel pool. | ||
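+ | |||
+ | To avoid copying the MAC address by hand, it can be looked up by host name in an ethers-style file. A minimal sketch, assuming lines of the form "MAC hostname"; mac_for_host is a hypothetical helper and would have to run on bleek or against a copy of its /etc/ethers. | ||

```shell
# Print the MAC address registered for a host in an ethers(5)-style
# file (lines of the form "<mac> <hostname>"); comment lines are skipped.
# mac_for_host is a hypothetical helper (assumption).
mac_for_host() {
    host=$1 ethers=${2:-/etc/ethers}
    awk -v h="$host" '$1 !~ /^#/ && $2 == h { print $1; exit }' "$ethers"
}

# Example:
#   virt-install ... --network bridge=br0,mac=$(mac_for_host my-new-machine.testbed) ...
```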
+ | |||
+ | |||
+ | A Debian installation is similar, but uses preseeding instead of kickstart. The preseed configuration is on bleek. | ||
+ | |||
+ | virt-install --name debian-builder.testbed --ram 1024 --disk pool=vmachines,size=10 \ | ||
+ | --network bridge=br0,mac=52:54:00:fc:18:dd --os-type linux --os-variant debiansqueeze \ | ||
+ | --location http://ftp.nl.debian.org/debian/dists/squeeze/main/installer-amd64/ \ | ||
+ | --extra 'auto=true priority=critical url=http://bleek.testbed/d-i/squeeze/preseed.cfg' | ||
+ | |||
+ | A few notes: | ||
+ | * The network autoconfiguration seems to happen too soon: the host bridge does not pass DHCP packets for a while after the domain is created, which systematically causes the Debian installer to complain. Fortunately the configuration can be retried from the menu, and the second attempt succeeds. | ||
+ | ** With Debian preseeding, this may be automated by either setting <tt>d-i netcfg/dhcp_options select Retry network autoconfiguration</tt> or <tt>d-i netcfg/dhcp_timeout string 60</tt>. | ||
+ | * Alternative installation kickstart files are available; e.g. http://bleek.nikhef.nl/mktestbed/kickstart/centos6-64-kvm.ks for CentOS 6. | ||
+ | * Sometimes a storage device is re-used (especially when recreating a domain after removing it '''and''' the associated storage). The partitioner may then see an existing LVM definition and fail, complaining that the partition already exists. To re-use an existing LVM volume instead, pass the argument <tt>--disk vol=vmachines/blah</tt>. | ||
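+ | |||
+ | The choice between creating a new volume and re-using an existing one could be sketched as below. This is an illustration only: disk_arg is a hypothetical helper, and it assumes that <tt>virsh vol-list</tt> prints two header lines followed by one volume per line with the volume name in the first column. | ||

```shell
# Print the value for virt-install's --disk option: re-use the volume
# if it is already listed in the pool, otherwise request a new one.
# Reads the output of `virsh vol-list <pool>` on standard input.
# disk_arg is a hypothetical helper (assumption).
disk_arg() {
    vol=$1 pool=$2 size=$3
    # Skip the two header lines, then look for an exact name match.
    if awk -v v="$vol" 'NR > 2 && $1 == v { found = 1 } END { exit !found }'; then
        printf 'vol=%s/%s\n' "$pool" "$vol"
    else
        printf 'pool=%s,size=%s\n' "$pool" "$size"
    fi
}

# Example:
#   virt-install ... --disk "$(virsh vol-list vmachines | disk_arg blah vmachines 10)" ...
```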
+ | |||
+ | ==== Automatic configuration of machines ==== | ||
+ | |||
+ | The default kickstart scripts for testbed VMs will download a 'firstboot' script at the end of the install cycle, based on the name they've been given by DHCP. Look in span.nikhef.nl:/usr/local/mktestbed/firstboot for the files that are used, but be aware that these are managed with git (gitosis on span). | ||
+ | |||
+ | === Configuration of LDAP authentication === | ||
+ | |||
+ | ==== Fedora Core 14 ==== | ||
+ | |||
+ | The machine fc14.testbed is configured for LDAP authentication against ldap.nikhef.nl. Some notes: | ||
+ | * /etc/nslcd.conf: | ||
+ | uri ldaps://ldap.nikhef.nl ldaps://hooimijt.nikhef.nl | ||
+ | base dc=farmnet,dc=nikhef,dc=nl | ||
+ | ssl on | ||
+ | tls_cacertdir /etc/openldap/cacerts | ||
+ | |||
+ | * /etc/openldap/cacerts is symlinked to /etc/grid-security/certificates. | ||
+ | |||
+ | ==== Debian 'Squeeze' ==== | ||
+ | |||
+ | Debian is a bit different; the nslcd daemon is linked against GnuTLS instead of OpenSSL. Due to a bug (so it would seem) one cannot simply point to a directory of certificates. Debian provides a script to collect all the certificates in one big file. Here is the short procedure: | ||
+ | |||
+ | mkdir /usr/share/ca-certificates/igtf | ||
+ | for i in /etc/grid-security/certificates/*.0 ; do ln -s "$i" "/usr/share/ca-certificates/igtf/$(basename "$i").crt"; done | ||
+ | update-ca-certificates | ||
+ | |||
+ | The file /etc/nsswitch.conf needs to include these lines to use ldap: | ||
+ | passwd: compat ldap | ||
+ | group: compat ldap | ||
+ | |||
+ | The following can be used as /etc/nslcd.conf: | ||
+ | uid nslcd | ||
+ | gid nslcd | ||
+ | base dc=farmnet,dc=nikhef,dc=nl | ||
+ | ldap_version 3 | ||
+ | ssl on | ||
+ | uri ldaps://ldap.nikhef.nl ldaps://hooimijt.nikhef.nl | ||
+ | tls_cacertfile /etc/ssl/certs/ca-certificates.crt | ||
+ | timelimit 120 | ||
+ | bind_timelimit 120 | ||
+ | nss_initgroups_ignoreusers root | ||
+ | |||
+ | The libpam-ldap package needs to be installed as well, with the following in /etc/pam_ldap.conf: | ||
+ | base dc=farmnet,dc=nikhef,dc=nl | ||
+ | uri ldaps://ldap.nikhef.nl/ ldaps://hooimijt.nikhef.nl/ | ||
+ | ldap_version 3 | ||
+ | pam_password md5 | ||
+ | ssl on | ||
+ | tls_cacertfile /etc/ssl/certs/ca-certificates.crt | ||
+ | |||
+ | At least this seems to work. | ||
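+ | |||
+ | A quick way to check whether the name service lookups work is to ask getent for an account. A minimal sketch; user_resolves is a hypothetical helper, and the account name in the example is a placeholder. Use an account that exists only in LDAP, otherwise the answer may simply come from /etc/passwd. | ||

```shell
# Succeeds if the name service switch can resolve the given account.
# user_resolves is a hypothetical helper (assumption).
user_resolves() {
    getent passwd "$1" > /dev/null
}

# Example (someuser is a placeholder for an LDAP-only account):
#   user_resolves someuser && echo "lookup works"
```

+ | |||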
== Network == | == Network == |