Agile testbed

[[Image:P4ctb-3.svg|thumb|Diagram of the agile test bed]]

== Introduction to the Agile testbed ==
 
TODO: integrate the contents of the [[Testbed_Update_Plan]] with this page.
 
  
 
The ''Agile Testbed'' is a collection of virtual machine servers, tuned for quickly setting up virtual machines for testing, certification and experimentation in various configurations.
==== Installing the machine ====
  
* '''Choose a VM host''' to start the installation on. Peruse the [[#Hardware|hardware inventory]] and pick one of the available machines.
 
* '''Choose a [[#Storage|storage option]]''' for the machine's disk image.
 
* '''Choose OS, memory and disk space''' as needed (an example installation command is sketched below).
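The steps above might translate into a <code>virt-install</code> run along these lines; this is only a sketch: the guest name, sizes, installer URL and bridge are illustrative and have to be adapted to the chosen VM host and network.

 # create and start a new KVM guest via libvirt (names and sizes are examples)
 virt-install --name testvm1 --ram 2048 --vcpus 2 \
     --disk pool=vmachines,size=20 \
     --network bridge=br2 --graphics none \
     --location http://deb.debian.org/debian/dists/stretch/main/installer-amd64/ \
     --extra-args 'console=ttyS0'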
  
 
==== adding a new user to the testbed ====

Users are known from their LDAP entries. All it takes to allow another user on the testbed is adding their name to
 /etc/security/access.conf
on bleek (at least if logging on to bleek is necessary), creating a home directory on bleek, and copying the user's ssh key to the appropriate file.

Something along these lines (but this is untested):
 test -d /user/$NEWUSER || cp -r /etc/skel /user/$NEWUSER
 chown -R $NEWUSER:`id -ng $NEWUSER` /user/$NEWUSER
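The remaining two steps (the access.conf entry and the ssh key) could look roughly like this; equally untested, and the key file name is only an example:

 # allow the new user to log in on bleek
 echo "+ : $NEWUSER : ALL" >> /etc/security/access.conf
 # install the user's public ssh key into the new home directory
 install -d -m 700 -o $NEWUSER -g `id -ng $NEWUSER` /user/$NEWUSER/.ssh
 cat newuser-key.pub >> /user/$NEWUSER/.ssh/authorized_keys
 chown $NEWUSER /user/$NEWUSER/.ssh/authorized_keys
 chmod 600 /user/$NEWUSER/.ssh/authorized_keys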
  
 
==== removing a user from the testbed ====
! ACL
|-
| 82
| [[NDPF_System_Functions#P4CTB|P4CTB]]
| 194.171.96.16/28
| 194.171.96.30
| br82
| No inbound traffic on privileged ports
|-
| 88
| [[NDPF_System_Functions#Nordic (Open_Experimental)|Open/Experimental]]
| 194.171.96.32/27
| 194.171.96.62
| br88
| Open
|-
| 97 (untagged)
| local
| 10.198.0.0/16
| testbed only
|- style="color: #999;"
| 84
| [[NDPF System Functions#MNGT/IPMI|IPMI and management]]
| 172.20.0.0/16

The relevant part of <code>/etc/network/interfaces</code> on a VM host then looks something like this:

  # The primary network interface
  auto eth0
  iface eth0 inet manual
        up ip link set $IFACE mtu 9000
 
  auto br0
  iface br0 inet dhcp
        bridge_ports eth0
 
  auto eth0.82
  iface eth0.82 inet manual
        up ip link set $IFACE mtu 9000
        vlan_raw_device eth0
 
  auto br2
  iface br2 inet manual
        bridge_ports eth0.82
 
  auto eth0.88
  iface eth0.88 inet manual
        vlan_raw_device eth0
 
  auto br8
  iface br8 inet manual
        bridge_ports eth0.88
 
  auto vlan100
  iface vlan100 inet manual
        up ip link set $IFACE mtu 1500
        vlan_raw_device eth0.82
 
  auto br2_100
  iface br2_100 inet manual
        bridge_ports vlan100
  
In this example VLAN 82 is configured on the physical interface eth0 (as eth0.82) and this interface is bridged into br2; the nested VLAN 100 is configured on top of eth0.82 (the vlan100 interface), and finally the bridge br2_100 is made from it. So VMs that are added to this bridge will only receive traffic coming from VLAN 100 nested inside VLAN 82.
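A guest can then be connected to the appropriate bridge when it is defined, or afterwards with libvirt; for example (the domain name is only illustrative):

 # connect an existing guest to the nested-VLAN bridge
 virsh attach-interface --domain testvm1 --type bridge --source br2_100 --model virtio --config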
  
  
  SNAT      all  --  10.198.0.0/16        0.0.0.0/0          to:194.171.96.17 
  ...
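A rule like the one listed above would typically be added with something along these lines (reconstructed from the listing, not copied from the actual firewall setup):

 # masquerade the internal testbed network behind 194.171.96.17
 iptables -t nat -A POSTROUTING -s 10.198.0.0/16 -j SNAT --to-source 194.171.96.17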

=== Multicast ===

The systems for clustered LVM and Ganglia rely on multicast to work. Some out-of-the-box Debian installations end up with a host entry like

 127.0.1.1 arrone.testbed

These entries should be removed!
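Checking for and removing such an entry could be done like this (untested):

 # show the offending entry, if any, then delete it
 grep -n '^127\.0\.1\.1' /etc/hosts
 sed -i '/^127\.0\.1\.1/d' /etc/hosts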
  
 
== Storage ==

The hypervisors of the testbed all connect to the same shared storage backend (a Fujitsu DX200 system called KLAAS) over iSCSI.
The storage backend exports a number of pools to the testbed. These are formatted as LVM groups and shared through a clustered LVM setup.

In libvirt, the VG is known as a 'pool' under the name <code>vmachines</code> (location <code>/dev/p4ctb</code>).
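Volumes for guests can then be handled through that pool with the usual libvirt tools; for example (volume name and size are only illustrative):

 virsh pool-info vmachines
 virsh vol-create-as vmachines testvm1-disk0 20G
 virsh vol-list vmachines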
=== Clustered LVM setup ===

The clustering of nodes is provided by corosync. Here are the contents of the configuration file /etc/corosync/corosync.conf:
 totem {
         version: 2
         cluster_name: p4ctb
         token: 3000
         token_retransmits_before_loss_const: 10
         clear_node_high_bit: yes
         crypto_cipher: aes256
         crypto_hash: sha256
         interface {
                 ringnumber: 0
                 bindnetaddr: 10.198.0.0
                 mcastport: 5405
                 ttl: 1
         }
 }
 
 logging {
         fileline: off
         to_stderr: no
         to_logfile: no
         to_syslog: yes
         syslog_facility: daemon
         debug: off
         timestamp: on
         logger_subsys {
                 subsys: QUORUM
                 debug: off
         }
 }
 
 quorum {
         provider: corosync_votequorum
         expected_votes: 2
 }

The crypto settings refer to a file /etc/corosync/authkey which must be present on all systems. There is no predefined definition of the cluster: any node can join, which is why the shared key is a good idea; you don't want any unexpected members joining the cluster. The quorum of 2 is, of course, because there are only 3 machines at the moment.
  
As long as the cluster is quorate everything should be fine. That means that at any time, one of the machines can be maintained, rebooted, etc. without affecting the availability of the storage on the other nodes.

As long as at least one node has the cluster up and running, others should be able to join even if the cluster is not quorate. That means that if only a single node out of three is up, the cluster is no longer quorate and storage queries are blocked; but when another node joins, the cluster is quorate again and should unblock.
{| class="wikitable"
+
 
!  || Fibre Channel || iSCSI
+
==== installation ====
|-
+
 
| blade13
+
Based on Debian 9.
|style="background-color: #cfc;"| yes
+
 
|style="background-color: #fcc;"| no
+
Install the required packages:
|-
+
 
| blade14
+
apt-get install corosync clvm
|style="background-color: #cfc;"| yes
+
 
|style="background-color: #fcc;"| no
+
Set up clustered locking in lvm:
|-
+
 
| melkbus
+
sed -i 's/^    locking_type = 1$/    locking_type = 3/' /etc/lvm/lvm.conf
|style="background-color: #cfc;"| yes
+
 
|style="background-color: #fcc;"| no
+
Make sure all nodes have the same corosync.conf file and the same authkey. A key can be generated with corosync-keygen.
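Generating the key on one node and copying it to the others might look like this (the host name is only an example):

 # generate /etc/corosync/authkey on one node
 corosync-keygen
 # distribute the key and the configuration to the other cluster members
 scp /etc/corosync/authkey /etc/corosync/corosync.conf root@bl0-14:/etc/corosync/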
==== Running ====

Start corosync

 systemctl start corosync

Test the cluster status with

 corosync-quorumtool -s
 dlm_tool -n ls

Should show all nodes.
Start the iscsi daemon

 systemctl start iscsid
 systemctl start multipathd

See if the iscsi paths are visible.

 multipath -ll
 3600000e00d2900000029295000110000 dm-1 FUJITSU,ETERNUS_DXL
 size=2.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
 |-+- policy='service-time 0' prio=50 status=active
 | |- 6:0:0:1 sdi 8:128 active ready running
 | `- 3:0:0:1 sdg 8:96  active ready running
 `-+- policy='service-time 0' prio=10 status=enabled
   |- 4:0:0:1 sdh 8:112 active ready running
   `- 5:0:0:1 sdf 8:80  active ready running
 3600000e00d2900000029295000100000 dm-0 FUJITSU,ETERNUS_DXL
 size=2.0T features='2 queue_if_no_path retain_attached_hw_handler' hwhandler='1 alua' wp=rw
 |-+- policy='service-time 0' prio=50 status=active
 | |- 4:0:0:0 sdb 8:16  active ready running
 | `- 5:0:0:0 sdc 8:32  active ready running
 `-+- policy='service-time 0' prio=10 status=enabled
   |- 3:0:0:0 sdd 8:48  active ready running
   `- 6:0:0:0 sde 8:64  active ready running
Only then start the clustered lvm.

 systemctl start lvm2-cluster-activation.service
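After that, the shared volume group should be usable from every node that has joined the cluster; for example (the logical volume name and size are only illustrative):

 # the volume group behind the libvirt pool is called p4ctb
 vgs p4ctb
 lvcreate -n testvm1-disk0 -L 20G p4ctb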
 
==== Troubleshooting ====

Cluster log messages are found in /var/log/syslog.
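For example, to pull out the messages of the cluster daemons (the exact daemon names may differ per version):

 grep -E 'corosync|dlm_controld|clvmd' /var/log/syslog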
  
 
== Services ==
! disk
! [http://www.dell.com/support/ service tag]
! Fibre Channel
! location
! remarks
|-style="background-color: #cfc;"
| blade13
| bl0-13
| 70 GB + 1 TB Fibre Channel (shared)
| 5NZWF4J
| yes
| C08 blade13
|
|-style="background-color: #cfc;"
| blade14
| bl0-14
| align="right"|16GB
| Debian 6, KVM
| 70 GB
| 4NZWF4J
| yes
| C08 blade14
|
|-style="background-color: #cfc;"
| melkbus
| bl0-02
| PEM600
| Intel E5450 @3.00GHz
| 2&times;4
| align="right"|32GB
| VMWare ESXi
| 2&times; 320GB SAS disks + 1 TB Fibre Channel (shared)
| 76T974J
| yes
| C08, blade 2
|
|-style="background-color: #ffc;"
| arrone
 
| 70 GB + 400 GB iSCSI (shared)
 
| 70 GB + 400 GB iSCSI (shared)
 
| 982MY2J
 
| 982MY2J
 +
| no
 
| C10
 
| C10
| storage shared with aulnes
+
|  
 
|-style="background-color: #ffc;"
 
|-style="background-color: #ffc;"
 
| aulnes
 
| aulnes
Line 546: Line 636:
 
| 70 GB + 400 GB iSCSI (shared)
 
| 70 GB + 400 GB iSCSI (shared)
 
| B82MY2J
 
| B82MY2J
 +
| no
 
| C10
 
| C10
| storage shared with arrone
+
|  
 
|-style="background-color: #ffc;"
 
|-style="background-color: #ffc;"
 
| toom
 
| toom
Line 558: Line 649:
 
| Hardware raid1 2&times;715GB disks
 
| Hardware raid1 2&times;715GB disks
 
| DC8QG3J
 
| DC8QG3J
 +
| no
 
| C10
 
| C10
| current Xen 3 hypervisor with mktestbed scripts
+
|  
 
|-style="background-color: #ffc;"
 
|-style="background-color: #ffc;"
 
| span
 
| span
Line 570: Line 662:
 
| Hardware raid10 on 4&times;470GB disks (950GB net)
 
| Hardware raid10 on 4&times;470GB disks (950GB net)
 
| FP1BL3J
 
| FP1BL3J
 +
| no
 
| C10
 
| C10
 
| plus [[#Squid|squid proxy]]
 
| plus [[#Squid|squid proxy]]
|-style="background-color: #ffc;"
 
| melkbus
 
| bl0-02
 
| PEM600
 
| Intel E5450 @3.00GHz
 
| 2&times;4
 
| align="right"|32GB
 
| VMWare ESXi
 
| 2&times; 320GB SAS disks + 1 TB Fibre Channel (shared)
 
| 76T974J
 
| C08, blade 2
 
|
 
 
|-style="color: #444;"
 
|-style="color: #444;"
 
| kudde
 
| kudde
| C10
| Contains hardware encryption tokens for robot certificates; managed by Jan Just
|-style="color: #444;"
| storage
| put
| PE2950
| Intel E5150 @ 2.66GHz
| 2&times;2
| align="right"|8GB
| FreeNAS 8.3
| 6&times; 500 GB SATA, raidz (ZFS)
| HMXP93J
| C03
| former garitxako
|- style="color: #444;"
| ent
| &mdash;
| SATA 80GB
| &mdash;
| no
| C24
| OS X box (no virtualisation)
|-style="color: #444;"
| ren
| bleek
| PE1950
| Intel 5150  @ 2.66GHz
| 2&times;2
| align="right"|8GB
| CentOS 5
| software raid1 2&times;500GB disks
| 7Q9NK2J
| no
| C10
| High Availability, dual power supply; former bleek
|}
  
  ipmi-oem -h host.ipmi.nikhef.nl -u username -p password dell get-system-info service-tag
  
Most machines run [http://www.debian.org/releases/stable/ Debian wheezy] with [http://www.linux-kvm.org/page/Main_Page KVM] for virtualization, managed by [http://libvirt.org/ libvirt].
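The hypervisors can be reached with the standard libvirt tools from any machine with access to the testbed network; for example (the host name is only an example):

 virsh --connect qemu+ssh://root@arrone.testbed/system list --all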
  
 
See [[NDPF_Node_Functions#P4CTB|the official list]] of machines for the most current view.
  
 
This just means qemu could not create the domain!

=== Installing Debian on blades with Fibre Channel ===

Although FC support on Debian works fine, using the multipath-tools-boot package is a bit tricky. It will update the initrd to include the multipath libraries and tools, to make them available at boot time. This happened on blade-13; on reboot it was unable to mount the root partition (the message was 'device or resource busy') because the device mapper had somehow taken hold of the SCSI disk. By changing the root=UUID=xxxx stanza in the GRUB menu to root=/dev/dm-2 (this was guess-work) I managed to boot the system. There were several possible remedies to resolve the issue:
# rerun update-grub. This should replace the UUID= with a link to /dev/mapper/xxxx-part1
# blacklist the disk in the device mapper (and run mkinitramfs)
# remove the multipath-tools-boot package altogether.

I opted for blacklisting; this is what's in /etc/multipath.conf:
 blacklist {
   wwid 3600508e000000000d6c6de44c0416105
 }
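After changing the blacklist the initramfs has to be regenerated (the mkinitramfs step mentioned above); on Debian this is usually done with:

 update-initramfs -u -k all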

== Migration plans to a cloud infrastructure ==

Previous testbed cloud experiences are [[Agile testbed/Cloud|reported here]].

Currently, using plain libvirt seems to fit most of our needs.

Revision as of 16:59, 6 September 2018