Virtual Machines working group

== Members and assignment ==
Sander Klous - Nikhef (Chair)<br/>
Ronald Starink - Nikhef<br/>
Marc van Driel - NBIC<br/>
Pieter van Beek - SARA<br/>
Ron Trompert - SARA<br/>
<br/>
[http://www.nikhef.nl/pub/projects/grid/gridwiki/images/4/4b/Assignment.pdf Charge]
 
== Meetings ==
Kick-off - Monday July 6, 2009: [http://www.nikhef.nl/pub/projects/grid/gridwiki/images/5/5f/Agenda06-07-09.txt agenda (Dutch)], [http://www.nikhef.nl/pub/projects/grid/gridwiki/images/2/2d/Notulen06-07-09.txt minutes (Dutch)], [https://wiki.nbic.nl/index.php/BigGrid_virtualisatie/notulen_20080706 updated agenda/minutes (Dutch)]

== Presentations ==
[http://www.nikhef.nl/pub/projects/grid/gridwiki/images/2/2d/Sky_computing.pdf Sky computing] - Sander Klous (Monday July 6, 2009): a summary of the CERN virtual machines workshop (see the Other information section below) and an introduction for the kick-off meeting of the BIG grid virtual machines working group.

== Open Issues ==
* Network Address Translation - What is the load?
* Virtual Machine Isolation - Prohibit internal network connectivity with IPTables (see the sketch below).
* Image repository - Storage Area Network or distributed over the worker nodes.
* Policy document
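
A minimal sketch of the isolation idea with iptables; the bridge name (xenbr0), the VM subnet (10.0.0.0/24) and the internal cluster network (172.16.0.0/16) are assumptions for illustration, not our actual configuration:

 # Drop traffic from the VM bridge towards the internal cluster network,
 # while masquerading everything else so VMs keep outbound connectivity.
 # Interface names and subnets are assumptions.
 iptables -A FORWARD -i xenbr0 -d 172.16.0.0/16 -j DROP
 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o eth0 -j MASQUERADE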

== Infrastructure ==
We are setting up a testbed to investigate technical issues related to virtual machine management.

=== Hardware and Operating Systems ===
* Two Dell 1950 machines, dual CPU, 4 cores per CPU
** One machine has a CentOS-5 installation
** One machine has a Debian-squeeze installation
 
=== Software ===
* CentOS-5 comes with Xen 3.0
* Debian-squeeze comes with Xen 3.3
** The Debian-squeeze Xen packages have a [http://lists.alioth.debian.org/pipermail/pkg-xen-devel/2009-June/002344.html problem] with tap:aio. The fix:
 ln -s /usr/lib/xen-3.2-1/bin/tapdisk /usr/sbin
 echo xenblktap >> /etc/modules
* OpenNebula has been installed (stand-alone) on CentOS-5 following [http://www.opennebula.org/doku.php?id=documentation:rel1.2:qg this] guide
** A few additional steps were needed:
*** Install rubygems and rubygem-sqlite3
*** OpenNebula has to be added to the sudoers file for xm and xentop
*** Sudoers should not require a tty
 wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm
 wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygems-1.3.1-1.el5.noarch.rpm
 sudo rpm -Uvh rubygems-1.3.1-1.el5.noarch.rpm rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm

In /etc/sudoers (on all machines), add the two NOPASSWD entries and comment out the requiretty default:
 opennebula ALL = NOPASSWD: /usr/sbin/xm
 opennebula ALL = NOPASSWD: /usr/sbin/xentop
 #Defaults    requiretty
 
* Installed iSCSI target and client software for the shared image repository (see the sketch after this list)
** Howtos: [http://www.howtoforge.com/using-iscsi-on-debian-lenny-initiator-and-target Debian client/server], [http://www.cyberciti.biz/tips/rhel-centos-fedora-linux-iscsi-howto.html CentOS client], [http://www.cyberciti.biz/tips/howto-setup-linux-iscsi-target-sanwith-tgt.html CentOS server]
** Maybe test later with encrypted iSCSI
** Two new machines have been ordered with the required iSCSI offload
* The image repository consists of LVM volume groups
** The performance of LVM volumes is better than that of file-based images
** Each logical volume contains an image
** This allows easy creation and deletion of images
** VMs can run from cloned (Copy-On-Write) images
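
As a concrete illustration of this setup, a hypothetical sketch of exporting one logical volume with tgt on the repository server and attaching it from a worker node; the volume group, sizes, IQN, and host name are assumptions:

 # On the repository server: create an image volume and export it over iSCSI.
 lvcreate -L 10G -n centos5-image vg_images
 tgtadm --lld iscsi --op new --mode target --tid 1 \
        --targetname iqn.2009-07.nl.nikhef:centos5-image
 tgtadm --lld iscsi --op new --mode logicalunit --tid 1 --lun 1 \
        --backing-store /dev/vg_images/centos5-image
 tgtadm --lld iscsi --op bind --mode target --tid 1 --initiator-address ALL
 
 # On the worker node: discover the target and log in.
 iscsiadm --mode discovery --type sendtargets --portal repo.example.org
 iscsiadm --mode node --targetname iqn.2009-07.nl.nikhef:centos5-image --login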
 
== Implementation issues ==
Implemented iSCSI image management for OpenNebula following the [http://www.opennebula.org/doku.php?id=documentation:rel1.2:sm storage guide]:
* Changed the oned configuration as shown below
* Added the [http://www.nikhef.nl/pub/projects/grid/gridwiki/images/1/12/Tm_iscsi.tar.gz tm_iscsi configuration]
* Implemented the [http://www.nikhef.nl/pub/projects/grid/gridwiki/images/1/10/Iscsi.tar.gz transfer manager commands]
* Modified [http://www.nikhef.nl/pub/projects/grid/gridwiki/images/1/15/One-1.2.0.tar.gz XenDriver.cc and TransferManager.cc] to support LVM images

In /opt/opennebula/etc/oned.conf:
 TM_MAD = [
     name       = "tm_iscsi",
     executable = "one_tm",
     arguments  = "tm_iscsi/tm_iscsi.conf",
     default    = "tm_iscsi/tm_iscsi.conf" ]
 
The tm_iscsi configuration files:
 /opt/opennebula/etc/tm_iscsi/tm_iscsi.conf
 /opt/opennebula/etc/tm_iscsi/tm_iscsirc
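
Presumably tm_iscsi.conf maps each transfer manager action to one of the command scripts listed below, by analogy with the stock tm_nfs.conf; this is a sketch from that assumption, the tarball above is authoritative:

 # Sketch of tm_iscsi.conf (assumed format, analogous to tm_nfs.conf).
 CLONE   = iscsi/tm_clone.sh
 LN      = iscsi/tm_ln.sh
 MKSWAP  = iscsi/tm_mkswap.sh
 MKIMAGE = iscsi/tm_mkimage.sh
 DELETE  = iscsi/tm_delete.sh
 MV      = iscsi/tm_mv.sh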
 
The transfer manager command scripts:
 /opt/opennebula/lib/tm_commands/iscsi/tm_clone.sh
 /opt/opennebula/lib/tm_commands/iscsi/tm_delete.sh
 /opt/opennebula/lib/tm_commands/iscsi/tm_ln.sh
 /opt/opennebula/lib/tm_commands/iscsi/tm_mkimage.sh
 /opt/opennebula/lib/tm_commands/iscsi/tm_mkswap.sh
 /opt/opennebula/lib/tm_commands/iscsi/tm_mv.sh
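
For illustration, a heavily simplified sketch of the clone operation: instead of copying the image, an LVM-backed tm_clone.sh can create a copy-on-write snapshot of the source volume. The volume group name, snapshot size, and argument handling are assumptions; the actual scripts are in the tarball above:

 #!/bin/bash
 # Sketch: OpenNebula invokes the clone script with <source> <destination>.
 # Here the "clone" is a copy-on-write LVM snapshot of the source volume.
 SRC_LV=$(basename "$1")   # e.g. centos5-image   (assumed naming)
 DST_LV=$(basename "$2")   # e.g. one-42-disk-0   (assumed naming)
 lvcreate --snapshot --size 1G --name "$DST_LV" "/dev/vg_images/$SRC_LV"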
 
The modified source files:
 .../one-1.2.0/src/vmm/XenDriver.cc
 .../one-1.2.0/src/tm/TransferManager.cc
 
=== Local caching ===
Network traffic for virtual machine management can be optimized significantly with two caches on each worker node:
# A read cache for the original virtual machine image, to facilitate reuse on the same worker node.
# A write-back cache for the copy-on-write clone, to allow local writes while the virtual machine is active.

If requested by the user, the copy-on-write clone can be synchronized with the image repository when the virtual machine has finished. After this synchronization the write-back cache becomes obsolete and can be removed. We implemented both the read cache and the write-back cache at the block-device level (i.e. at the iSCSI/LVM level) with [http://github.com/mingzhao/dm-cache/tree/master dm-cache]. One LVM partition on the worker node serves as a persistent local read cache for the virtual machine image; another LVM partition serves as a transient local write-back cache for the copy-on-write clone. The transient cache is created and removed on demand by OpenNebula.

Unfortunately, no CentOS or Debian packages are available for dm-cache. Here is the recipe to build the kernel module from source.

On Debian:
 # Fetch and prepare the kernel source matching the running Xen kernel.
 apt-get install linux-source linux-patch-debian
 cd /usr/src
 tar jxf linux-source-2.6.26.tar.bz2
 /usr/src/kernel-patches/all/2.6.26/apply/debian -a x86_64 -f xen
 cd linux-source-2.6.26
 cp /boot/config-2.6.26-2-xen-amd64 .config
 # In the Makefile, set: EXTRAVERSION = -2-xen-amd64
 make prepare
 cp /usr/src/linux-headers-2.6.26-2-xen-amd64/Module.symvers .
 cp -r /usr/src/linux-kbuild-2.6.26/scripts/* scripts
 # Fetch the dm-cache source and build it against this source tree.
 cd
 wget -O dm-cache.tar.gz http://github.com/mingzhao/dm-cache/tarball/master
 tar zxvf dm-cache.tar.gz
 cd dm-cache/2.6.29
 ln -s /usr/src/linux-source-2.6.26/drivers/md/dm.h .
 ln -s /usr/src/linux-source-2.6.26/drivers/md/dm-bio-list.h .
 # In dm-cache.c, change BIO_RW_SYNCIO to BIO_RW_SYNC (line 172).
 # Create the Makefile in this directory
 # (available at http://www.nikhef.nl/pub/projects/grid/gridwiki/images/b/b6/Makefile).
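 # A minimal out-of-tree kbuild Makefile looks roughly like this
 # (sketch; the linked Makefile is authoritative):
 #   obj-m := dm-cache.o
 #   default:
 #           $(MAKE) -C /usr/src/linux-source-2.6.26 M=$(CURDIR) modules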
 make
 insmod dm-cache.ko
 cp dm-cache.ko /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/dm-cache.ko
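
After loading the module, dmsetup can confirm that the cache target is registered; a quick check, plus a rough outline of how a cached mapping is created (the exact dm-cache table parameters are version specific, see the module's README; the device names are assumptions):

 # Verify that the module is loaded and the "cache" target is available.
 lsmod | grep dm_cache
 dmsetup targets
 # A cached device is then set up by feeding dmsetup a table that names the
 # source device and the cache device, roughly along these lines:
 #   echo "0 $(blockdev --getsz /dev/vg_local/clone) cache \
 #     /dev/vg_local/clone /dev/vg_local/wb-cache ..." | dmsetup create vm-cached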
 
== Other information ==
* CERN [http://indico.cern.ch/conferenceDisplay.py?confId=56353 June 2009 workshop] on virtual machines
