Virtual Machines working group


Members and assignment

Sander Klous - Nikhef (Chair)
Ronald Starink - Nikhef
Marc van Driel - NBIC
Pieter van Beek - SARA
Ron Trompert - SARA

Charge

Meetings

Kick-off - Monday July 6, 2009: agenda (Dutch, http://www.nikhef.nl/pub/projects/grid/gridwiki/images/5/5f/Agenda06-07-09.txt), minutes (Dutch, http://www.nikhef.nl/pub/projects/grid/gridwiki/images/2/2d/Notulen06-07-09.txt), updated agenda/minutes (Dutch, https://wiki.nbic.nl/index.php/BigGrid_virtualisatie/notulen_20080706)

Presentations

Sky computing - Sander Klous (Sander@nikhef.nl), Monday July 6, 2009: a summary of the CERN virtual machines workshop (see Other information) and an introduction for the kick-off meeting of the BiG Grid virtual machines working group.

Open Issues

  • Network Address Translation - What is the load?
  • Virtual Machine Isolation - Prohibit internal network connectivity with IPTables (see the sketch below).
  • Image repository - Storage Area Network or distributed over worker nodes.
  • Policy document
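
A hypothetical illustration of such IPTables-based isolation on a worker node (the bridge name xenbr0 and the internal 10.0.0.0/8 range are assumptions, not the testbed's actual configuration):

# Allow replies to connections initiated from the internal network,
# but drop anything the virtual machines try to start towards it.
iptables -A FORWARD -i xenbr0 -d 10.0.0.0/8 -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -i xenbr0 -d 10.0.0.0/8 -j DROP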

Infrastructure

We are setting up a testbed to investigate technical issues related to virtual machine management.

Hardware and Operating Systems

  • Two Dell 1950 machines, dual CPU, 4 cores per CPU
    • One machine has a CentOS-5 installation
    • One machine has a Debian-squeeze installation

Software

  • CentOS-5 comes with Xen 3.0
  • Debian-squeeze comes with Xen 3.3
    • Debian-squeeze Xen packages have a problem with tap:aio.
Fix:
ln -s /usr/lib/xen-3.2-1/bin/tapdisk /usr/sbin
echo xenblktap >> /etc/modules
  • OpenNebula has been installed (stand-alone) on CentOS-5 following this guide
    • A few additional steps were needed:
      • Install rubygems and rubygem-sqlite3
      • OpenNebula has to be added to the sudoers file for xm and xentop
      • Sudoers should not require a tty
wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm
wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygems-1.3.1-1.el5.noarch.rpm
sudo rpm -Uvh rubygems-1.3.1-1.el5.noarch.rpm rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm 
In /etc/sudoers (on all machines)
opennebula ALL = NOPASSWD: /usr/sbin/xm
opennebula ALL = NOPASSWD: /usr/sbin/xentop
#Defaults    requiretty
  • Installed iSCSI target and client software for shared image repository
  • Image repository consists of LVM volume groups
    • Performance of LVM is better than that of file-based images
    • Each logical volume contains an image
    • This allows easy creation/deletion of new images
    • VMs can run from cloned (Copy-On-Write) images, as sketched below
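
A minimal sketch of this layout (the volume group name vgimages, the image names and the sizes are illustrative assumptions):

# One logical volume per image: create it and copy a master image into it.
lvcreate -L 4G -n centos5-base vgimages
dd if=/srv/images/centos5-base.img of=/dev/vgimages/centos5-base bs=1M

# A VM runs from a copy-on-write snapshot of the master volume,
lvcreate -s -L 4G -n vm42-disk /dev/vgimages/centos5-base

# and deleting that VM image again is just removing the snapshot.
lvremove -f /dev/vgimages/vm42-disk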

Implementation issues

Implemented iSCSI image management for OpenNebula following the storage guide

In /opt/opennebula/etc/oned.conf:
TM_MAD = [
   name       = "tm_iscsi",
   executable = "one_tm",
   arguments  = "tm_iscsi/tm_iscsi.conf",
   default    = "tm_iscsi/tm_iscsi.conf" ]
The remaining pieces of the iSCSI transfer manager are in the following files:
/opt/opennebula/etc/tm_iscsi/tm_iscsi.conf
/opt/opennebula/etc/tm_iscsi/tm_iscsirc
/opt/opennebula/lib/tm_commands/iscsi/tm_clone.sh
/opt/opennebula/lib/tm_commands/iscsi/tm_delete.sh
/opt/opennebula/lib/tm_commands/iscsi/tm_ln.sh
/opt/opennebula/lib/tm_commands/iscsi/tm_mkimage.sh
/opt/opennebula/lib/tm_commands/iscsi/tm_mkswap.sh
/opt/opennebula/lib/tm_commands/iscsi/tm_mv.sh
.../one-1.2.0/src/vmm/XenDriver.cc
.../one-1.2.0/src/tm/TransferManager.cc
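
For illustration, a stripped-down sketch of what the clone step in such a transfer-manager script can look like (the argument handling, the exec_and_log stand-in and the snapshot size are assumptions; the actual logic is in the tm_clone.sh listed above):

#!/bin/bash
# Hypothetical sketch of an LVM-backed clone, in the style of the OpenNebula TM scripts.

# Stand-in for the exec_and_log helper used by the real scripts.
exec_and_log() {
    echo "Executing: $1"
    eval "$1" || { echo "FAILURE: $1" >&2; exit 1; }
}

# OpenNebula passes source and destination as <host>:<path>.
SRC=$1
DST=$2
DST_HOST=$(echo "$DST" | cut -d: -f1)
DST_PATH=$(echo "$DST" | cut -d: -f2)
SRC_PATH=$(echo "$SRC" | cut -d: -f2)

SIZE=4G   # illustrative; would normally be derived from the VM template

# Create a copy-on-write snapshot of the master logical volume on the worker node.
exec_and_log "ssh $DST_HOST sudo /usr/sbin/lvcreate -s -L$SIZE -n $(basename $DST_PATH) $SRC_PATH"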

Local caching

Network traffic for Virtual Machine management can be optimized significantly with two caches on each worker node:

  1. A read cache for the original Virtual Machine image to facilitate reuse on the same worker node.
  2. A write-back cache for the copy-on-write clone to allow local writes when the virtual machine is active.

If requested by the user, the copy-on-write clone can be synchronized with the image repository when the virtual machine has finished. After this synchronization, the write-back cache becomes obsolete and can be removed. We implemented both the read and the write-back cache at block device level (i.e. iSCSI/LVM level) with dm-cache (http://github.com/mingzhao/dm-cache/tree/master). One LVM partition on the worker node serves as persistent local read cache for the virtual machine image. Another LVM partition on the worker node serves as transient local write-back cache for the copy-on-write clone. The transient cache is created and removed on demand by OpenNebula.
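
As a rough sketch of how such a cache device is assembled with device-mapper (the device names are assumptions, and the numeric dm-cache parameters are placeholders; the exact table format is described in the dm-cache README):

ORIGIN=/dev/vgimages/centos5-base   # master image reached over iSCSI
CACHE=/dev/vglocal/readcache        # local LVM partition used as cache

# Map a new device that caches reads from $ORIGIN on $CACHE. The trailing
# numbers (persistence, block size, cache size, associativity, write policy)
# are illustrative only; check the dm-cache documentation for the exact format.
echo "0 $(blockdev --getsz $ORIGIN) cache $ORIGIN $CACHE 0 8 65536 16 1" | dmsetup create cached-image

# The virtual machine then uses /dev/mapper/cached-image instead of the raw iSCSI device.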

Unfortunately, no CentOS or Debian packages are available for dm-cache. Here is the recipe to build the kernel module from source.

On Debian:
apt-get install linux-source linux-patch-debian
cd /usr/src
tar jxf linux-source-2.6.26.tar.bz2
/usr/src/kernel-patches/all/2.6.26/apply/debian -a x86_64 -f xen
cd linux-source-2.6.26
cp /boot/config-2.6.26-2-xen-amd64 .config
<In the Makefile: EXTRAVERSION = -2-xen-amd64>
make prepare
cp /usr/src/linux-headers-2.6.26-2-xen-amd64/Module.symvers .
cp -r /usr/src/linux-kbuild-2.6.26/scripts/* scripts
cd
wget http://github.com/mingzhao/dm-cache/tarball/master
tar zxvf dm-cache.tar.gz
cd dm-cache/2.6.29
ln -s /usr/src/linux-source-2.6.26/drivers/md/dm.h .
ln -s /usr/src/linux-source-2.6.26/drivers/md/dm-bio-list.h .
<In dm-cache.c: change BIO_RW_SYNCIO to BIO_RW_SYNC (line 172)>
<Create Makefile (http://www.nikhef.nl/pub/projects/grid/gridwiki/images/b/b6/Makefile)>
make
insmod dm-cache.ko
cp dm-cache.ko /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/dm-cache.ko
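
After the module is in place, it can be registered with the running kernel and verified along these lines (a small usage sketch):

depmod -a                       # refresh module dependency information
modprobe dm-cache               # load by name instead of insmod with an explicit path
dmsetup targets | grep cache    # the 'cache' device-mapper target should now be listed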

Other information