Difference between revisions of "Agile testbed"

From PDP/Grid Wiki
 
(2 intermediate revisions by one other user not shown)
Line 154: Line 154:
  
 
==== adding a new user to the testbed ====
 
==== adding a new user to the testbed ====
 +
 +
Users are known from their LDAP entries. All it takes to allow another user on the testbed is adding their name to
 +
/etc/security/access.conf
 +
on bleek (at least if logging on to bleek is necessary), adding a home directory on bleek, and copying the user's ssh key to the appropriate file.
 +
 +
Something along these lines (but this is untested):
 +
test -d "/user/$NEWUSER" || cp -r /etc/skel "/user/$NEWUSER"
 +
chown -R "$NEWUSER:$(id -gn "$NEWUSER")" "/user/$NEWUSER"
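The two commands above could be wrapped in a small function so they can be tried in a scratch directory first. This is a sketch only: the function name <code>add_testbed_home</code> and its optional second and third arguments (home base directory and skeleton directory) are hypothetical, and it assumes the account already resolves through <code>id</code>.

```shell
# Sketch: create and hand over a home directory for an existing account.
# add_testbed_home USER [HOME_BASE] [SKEL]   (name and arguments are hypothetical)
add_testbed_home() {
    newuser=$1
    home_base=${2:-/user}
    skel=${3:-/etc/skel}

    # The account must already be known (through LDAP on the real system).
    id "$newuser" >/dev/null 2>&1 || { echo "unknown user: $newuser" >&2; return 1; }

    # Only copy the skeleton if the home directory does not exist yet.
    if [ ! -d "$home_base/$newuser" ]; then
        cp -r "$skel" "$home_base/$newuser" || return 1
    fi

    # Give the directory to the user and the user's primary group.
    chown -R "$newuser:$(id -gn "$newuser")" "$home_base/$newuser"
}
```

On bleek this would be called as <code>add_testbed_home $NEWUSER</code>; passing a temporary directory as second argument allows a dry run without touching /user.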
  
 
==== removing a user from the testbed ====
 
==== removing a user from the testbed ====
Line 347: Line 355:
 
== Storage ==
 
== Storage ==
  
The hypervisors of the testbed all connect to the same shared storage backend (known as The Compellent). Some of the nodes through Fibre Channel, the others who don't have that use iSCSI.
+
The hypervisors of the testbed all connect to the same shared storage backend (a Fujitsu DX200 system called KLAAS) over iSCSI.
The Compellent exports two sizable pools to the testbed (and more can be created if needed). These are formatted as LVM groups and shared through a clustered LVM setup.
+
The storage backend exports a number of pools to the testbed. These are formatted as LVM groups and shared through a clustered LVM setup.
  
In libvirt, these VGs are known as 'pools' under the names <code>vmachines</code> and <code>vmachines2</code>.
+
In libvirt, the VG is known as a 'pool' under the name <code>vmachines</code> (location <code>/dev/p4ctb</code>).
  
The older machine with 2TB of disk storage called 'put' is being decommissioned. Remaining VM images will be migrated away.
+
=== Clustered LVM setup ===
  
=== Clustered LVM setup ===
+
The clustering of nodes is provided by corosync. Here are the contents of the configuration file /etc/corosync/corosync.conf:
 +
 totem {
     version: 2
     cluster_name: p4ctb
     token: 3000
     token_retransmits_before_loss_const: 10
     clear_node_high_bit: yes
     crypto_cipher: aes256
     crypto_hash: sha256
     interface {
         ringnumber: 0
         bindnetaddr: 10.198.0.0
         mcastport: 5405
         ttl: 1
     }
 }
 
 logging {
     fileline: off
     to_stderr: no
     to_logfile: no
     to_syslog: yes
     syslog_facility: daemon
     debug: off
     timestamp: on
     logger_subsys {
         subsys: QUORUM
         debug: off
     }
 }
 
 quorum {
     provider: corosync_votequorum
     expected_votes: 2
 }
 +
 
 +
The crypto settings rely on the key file /etc/corosync/authkey, which must be present on all systems. The cluster has no predefined membership: any node can join, which is why the shared key is a good idea, as you don't want unexpected members joining the cluster. The quorum of 2 follows from there being only 3 machines at the moment.
  
[https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/High_Availability_Add-On_Overview/ch-cman.html Red Hat's CLVM suite] is a heavyweight high-availability system based on CMAN and depends on fencing of failing nodes. I've never been happy with it, especially since we lack a proper out-of-band way to do fencing from one node to another. After a couple of gruesome evenings kicking this around I found a better solution in [http://pixelchaos.net/2009/04/23/openais-an-alternative-to-clvm-with-cman/ openais]. This is based on building a quorum of nodes in order to do locking; the only processes required are corosync and clvmd.
+
As long as the cluster is quorate everything should be fine. That means that at any time, one of the machines can be maintained, rebooted, etc. without affecting the availability of the storage on the other nodes.
  
'''Do not use the service scripts in /etc/init.d/corosync or /etc/init.d/clvmd!'''
+
As long as at least one node has the cluster up and running, others should be able to join even if the cluster is not quorate. That means that if only a single node out of three is up, the cluster is no longer quorate and storage queries are blocked. When another node joins, the cluster is quorate again and storage should unblock.
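The quorum reasoning can be made concrete with a little arithmetic. The sketch below assumes a simple majority with one vote per node; the function names <code>quorum_for</code> and <code>is_quorate</code> are made up for illustration and are not part of corosync.

```shell
# quorum_for N: smallest number of votes that is a strict majority of N votes
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate UP TOTAL: succeed if UP nodes out of TOTAL form a majority
is_quorate() {
    [ "$1" -ge "$(quorum_for "$2")" ]
}
```

With three nodes, <code>quorum_for 3</code> prints 2, so <code>is_quorate 2 3</code> succeeds (one node can be down for maintenance) and <code>is_quorate 1 3</code> fails (storage queries block).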
  
Currently there are two pools on the Compellent formatted as VGs with clustered setup. When adding more pools in the future don't forget to set the clustered flag there as well.
 
  
 
==== installation ====
 
==== installation ====
 +
 +
Based on Debian 9.
  
 
Install the required packages:
 
Install the required packages:
  
  apt-get install openais clvmd
+
  apt-get install corosync clvm
  
 
Set up clustered locking in lvm:
 
Set up clustered locking in lvm:
Line 372: Line 417:
 
  sed -i 's/^    locking_type = 1$/    locking_type = 3/' /etc/lvm/lvm.conf
 
  sed -i 's/^    locking_type = 1$/    locking_type = 3/' /etc/lvm/lvm.conf
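Because the sed expression only matches one exact spelling of the line, it may silently change nothing. A small check such as the following can confirm the result; the function name <code>check_locking_type</code> is hypothetical and the default path is the one used above.

```shell
# check_locking_type [FILE]: succeed only if FILE sets locking_type = 3
# (an uncommented line; surrounding whitespace is ignored)
check_locking_type() {
    grep -Eq '^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*3[[:space:]]*$' \
        "${1:-/etc/lvm/lvm.conf}"
}
```

Running <code>check_locking_type && echo clustered</code> after the sed command shows whether the substitution took effect.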
  
Configure the cluster in /etc/cluster/cluster.conf. All nodes must share the same identical file. This file is currently maintained by [http://bleek.testbed/git/?p=salt.git;a=summary saltstack].
+
Make sure all nodes have the same corosync.conf file and the same authkey. A key can be generated with corosync-keygen.
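One way to verify that the files really are identical is to collect a copy from each node into a scratch directory and compare them byte for byte. This is only a sketch: the function name <code>all_identical</code> is hypothetical, and how the copies are collected is left open.

```shell
# all_identical REF FILE...: succeed only if every FILE is byte-identical to REF
all_identical() {
    ref=$1
    shift
    for f in "$@"; do
        # cmp -s is silent and only reports through its exit status
        cmp -s "$ref" "$f" || { echo "differs from $ref: $f" >&2; return 1; }
    done
}
```

For example, <code>all_identical /etc/corosync/authkey copy-from-node2 copy-from-node3</code>, where the copy file names are placeholders. The same check applies to corosync.conf.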
  
 
==== Running ====
 
==== Running ====
  
Saltstack maintains a custom init.d script to start the corosync and clvmd service in the right way. It's called /etc/init.d/clvm-openais.
+
Start corosync
 +
 
 +
systemctl start corosync
 +
 
 +
Test the cluster status with
  
===== Starting the service manually =====
+
corosync-quorumtool -s
Start corosync:
+
dlm_tool -n ls
  
aisexec
+
The output should list all nodes.
  
Start the cluster lvm daemon:
+
Start the iSCSI and multipath daemons
  
  clvmd -I openais
+
  systemctl start iscsid
 +
systemctl start multipathd
  
Test if everything works by running
+
Check that the iSCSI paths are visible:
  corosync-quorumtool -s
+
 
and if the cluster is quorate:
+
multipath -ll
  vgdisplay
+
 3600000e00d2900000029295000110000 dm-1 FUJITSU,ETERNUS_DXL
 size=2.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
 |-+- policy='service-time 0' prio=50 status=active
 | |- 6:0:0:1 sdi 8:128 active ready running
 | `- 3:0:0:1 sdg 8:96  active ready running
 `-+- policy='service-time 0' prio=10 status=enabled
   |- 4:0:0:1 sdh 8:112 active ready running
   `- 5:0:0:1 sdf 8:80  active ready running
 3600000e00d2900000029295000100000 dm-0 FUJITSU,ETERNUS_DXL
 size=2.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
 |-+- policy='service-time 0' prio=50 status=active
 | |- 4:0:0:0 sdb 8:16  active ready running
 | `- 5:0:0:0 sdc 8:32  active ready running
 `-+- policy='service-time 0' prio=10 status=enabled
   |- 3:0:0:0 sdd 8:48  active ready running
   `- 6:0:0:0 sde 8:64  active ready running
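To check the path count without reading the tree by eye, the output can be summarised per LUN. The sketch below assumes the output format shown above (WWID at the start of the line, no multipath aliases); the function name <code>count_ready_paths</code> is hypothetical.

```shell
# count_ready_paths: read `multipath -ll` output on stdin and print
# "<wwid> <number of paths in state 'active ready running'>" per LUN
count_ready_paths() {
    awk '
        /^[0-9a-f]+ dm-[0-9]+ / { wwid = $1; order[++n] = wwid }   # LUN header line
        /active ready running/  { ready[wwid]++ }                  # one line per path
        END { for (i = 1; i <= n; i++) print order[i], ready[order[i]] + 0 }
    '
}
```

Used as <code>multipath -ll | count_ready_paths</code>, each of the two LUNs in the listing above would be reported with 4 paths.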
 +
 
 +
Only then start the clustered LVM:
  
Restarting of the service means killing the corosync and clvmd processes and starting them with the above procedure. The aisexec script actually just starts corosync with an extra environment variable
+
systemctl start lvm2-cluster-activation.service
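The start-up order from this section can be collected in one function. This is a sketch: the function name <code>start_clustered_storage</code> and its optional argument are hypothetical, while the commands are the ones listed above. Passing <code>echo</code> as argument prints the commands instead of running them.

```shell
# start_clustered_storage [RUNNER]: run the start-up steps in order and
# stop at the first command that fails. RUNNER is prefixed to every
# command; use "echo" for a dry run.
start_clustered_storage() {
    run=${1:-}
    $run systemctl start corosync &&
    $run corosync-quorumtool -s &&
    $run dlm_tool -n ls &&
    $run systemctl start iscsid &&
    $run systemctl start multipathd &&
    $run multipath -ll &&
    $run systemctl start lvm2-cluster-activation.service
}
```

The function does not inspect the output of the status commands, so checking that all nodes and all iSCSI paths are listed remains a manual step.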
  
export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"
 
  
 
==== Troubleshooting ====
 
==== Troubleshooting ====

Latest revision as of 16:59, 6 September 2018