Tips on Using Ganeti to Manage a KVM-based Virtual Machine Cluster on Ubuntu Jaunty Jackalope 9.04

Update:  I apologize for not updating this post.  I struggled with this for quite a while before making real progress, which I’ll try to detail.  A few key points:

  1. debootstrap doesn’t install a bootloader, so even if you are using KVM, you need to specify a kernel on the parent/host and a root disk device (as it appears inside the VM) as part of the config. Make sure that the kernel matches the modules installed by debootstrap, or you’ll have lots of other problems.
  2. The default use of virtio for the disk interface causes problems with the KVM version that ships with Ubuntu. The virtual machine’s BIOS may not detect the disk. Specify IDE for less hassle.
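
As a rough sketch of how those two points translate into cluster settings, the helper below only builds the hypervisor parameter string; nothing in it talks to Ganeti. The root_path and disk_type parameter names come from the comments at the end of this post, kernel_path is assumed to be the matching kernel parameter, and kvm_hv_params is a made-up helper name.

```shell
# Sketch only: build a "kvm:key=value,..." string for gnt-cluster modify -H.
# kvm_hv_params is a hypothetical helper, not part of Ganeti.
kvm_hv_params() {
    kernel="$1"     # kernel on the host, e.g. /boot/vmlinuz-2.6-kvmU
    root_dev="$2"   # root device as the guest sees it, e.g. /dev/sda1 with IDE disks
    disk_type="$3"  # ide avoids the virtio detection problem described above
    printf 'kvm:kernel_path=%s,root_path=%s,disk_type=%s\n' \
        "$kernel" "$root_dev" "$disk_type"
}

# Possible use on the master node (as root):
#   gnt-cluster modify -H "$(kvm_hv_params /boot/vmlinuz-2.6-kvmU /dev/sda1 ide)"
```

Keeping the three values together makes it harder to change the disk type without also changing the root device name the kernel is told to mount.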

I’ve hacked up the ganeti-os-debootstrap scripts to use Ubuntu’s vmbuilder script to create Ubuntu VMs that do have a boot loader.  I need to do a little cleanup and then I’ll share my work.

————————————

We are using a number of virtual machines to support our efforts at work.  We’ve been running these on VMware Server on some Linux servers for the past year, but I’m looking at moving on from there to something that is based on more open software.  I wanted to share some of the reasons behind the choices I made, and how I got over some of the obstacles I encountered with my choice of Ubuntu Jaunty Jackalope (9.04) for my OS, KVM for virtualization and Ganeti to manage the virtual machines.  This won’t be exhaustive, but hopefully it will help other people.

I’d been eyeing Ganeti, a package for managing multiple Xen or KVM virtual machines running on a cluster of hosts.  I was particularly intrigued because Ganeti went so far as managing redundant storage via DRBD.  Still, I took a look at Eucalyptus because it implements significant portions of the Amazon Web Services API for provisioning system instances and persistent storage.  I was even more intrigued when I discovered that they supported both S3 (a key-value store) and EBS (a block-based storage layer).  I ended up choosing Ganeti though.  Eucalyptus required me to configure a shared highly available storage layer, something that Ganeti largely handled for me.  More importantly, a limitation in some of the software they integrated to provide EBS meant that I couldn’t run instances that used EBS volumes on the same machine that was providing the EBS storage service, which wasn’t acceptable for the small 2-4 host cluster I planned on building.

I also had the choice of the Xen or KVM hypervisors.  I chose KVM because it is supposed to be better supported by Ubuntu and, in the long run, looks like it will become the favored choice of Red Hat as well.

Installing Ganeti:

There is a version of Ganeti packaged for Ubuntu, but it is an older version that lacks the features that most interest me, which are only available in v2.0, so I worked from the Ganeti 2.0 installation document.  I ran into a few problems because it is skewed towards using the Xen hypervisor and Debian, while I wanted to use the KVM hypervisor on Ubuntu.

The first issue I hit was in trying to install the DRBD prerequisite.  DRBD mirrors block devices over the network, providing an important piece of the fault tolerance and high availability puzzle that Ganeti builds on.  Ganeti requires a more recent version of DRBD (DRBD8).  Earlier releases of Ubuntu and Debian package this version, but Jaunty only has a package for an older version of DRBD.  Stranger still, it has utilities for managing the more recent version.  With a little digging, I found that the modules for DRBD8 are actually packaged with the server kernel.  So, my first problem was no problem at all.

Initializing and Running Ganeti:

The next issue I hit was with changes Jaunty made to the default Python path, and the fact that the implications of those changes hadn’t propagated everywhere they needed to go.  The result was that once I installed Ganeti, I got a Python import error when trying to run ‘gnt-cluster init’.  My solution was to move the ‘ganeti’ directory from ‘site-packages’ to ‘dist-packages’.
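
For anyone repeating that move, here is a cautious sketch of it as a shell function. The Python library directory is passed in rather than hard-coded, because the exact location depends on how Ganeti was installed; relocate_to_dist_packages is a made-up helper name and the path in the usage comment is only an assumption.

```shell
# Sketch only: move one package directory from site-packages to dist-packages.
# relocate_to_dist_packages is a hypothetical helper name.
relocate_to_dist_packages() {
    lib_dir="$1"   # a Python library directory (location is an assumption)
    pkg="$2"       # package directory to move, e.g. ganeti
    src="$lib_dir/site-packages/$pkg"
    dst="$lib_dir/dist-packages/$pkg"
    if [ ! -d "$src" ]; then
        echo "$src not found" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "$dst already exists, not overwriting" >&2
        return 1
    fi
    mkdir -p "$lib_dir/dist-packages"
    mv "$src" "$dst"
}

# Possible use (path is an assumption, check where your install put it):
#   relocate_to_dist_packages /usr/lib/python2.6 ganeti
```

The function refuses to overwrite an existing dist-packages copy, so running it twice is harmless.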

The next problem I ran into is that I wasn’t using Xen.  I knew enough the first time through not to bother creating symlinks for a Xen instance kernel, but I didn’t really know what to do instead.  In trying to figure that out, I realized that I should have specified that the default hypervisor be kvm when I initialized the cluster.  Even though Xen wasn’t installed, it defaulted to Xen.  So, I had to destroy the cluster and initialize a new one:

gnt-cluster init --default-hypervisor=kvm myclustername

Default Kernel for New Instances:

When I first tried to create a new instance, I got this:

# gnt-instance add -t plain -s1G -n vmhost3 -o debootstrap vm1.ournet.net
Failure: command execution error:
Hypervisor parameter validation failed on node vmhost3: Instance kernel '/boot/vmlinuz-2.6-kvmU' not found or not a file

It looks like the solution to this problem is to adapt the instructions for creating symlinks for a Xen instance kernel, and link /boot/vmlinuz-2.6-kvmU to my current server kernel.  I have a feeling that I’ll be using a more stripped down kernel once I figure out how this all fits together.
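
A minimal sketch of that symlink step follows. It assumes the usual Ubuntu naming of vmlinuz-VERSION and initrd.img-VERSION in /boot, and it also creates an initrd-2.6-kvmU link, a name taken from the comments below rather than from anything I had working at this point. link_instance_kernel is a made-up helper name.

```shell
# Sketch only: point the kernel name Ganeti looks for at an installed host kernel.
# The boot directory is a parameter so the function can be tried on a scratch dir.
link_instance_kernel() {
    boot_dir="$1"   # normally /boot
    kver="$2"       # kernel version, e.g. the output of uname -r
    if [ ! -f "$boot_dir/vmlinuz-$kver" ]; then
        echo "no vmlinuz-$kver in $boot_dir" >&2
        return 1
    fi
    # Relative link targets, so the links stay valid if the directory is moved.
    ln -sfn "vmlinuz-$kver" "$boot_dir/vmlinuz-2.6-kvmU"
    if [ -f "$boot_dir/initrd.img-$kver" ]; then
        ln -sfn "initrd.img-$kver" "$boot_dir/initrd-2.6-kvmU"
    fi
}

# Possible use on each node (as root):
#   link_instance_kernel /boot "$(uname -r)"
```

Linking to the running host kernel is the quick option; whichever kernel you link, its modules need to be present inside the instance.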

OS Support Files for Creating New Instances:

The Ganeti install document covers installing the OS support files, but it seems the default configuration option puts the files in ‘/usr/local/share/ganeti/os’ rather than ‘/srv/ganeti/os’.  The README that comes with the support files suggests more appropriate configuration options:

./configure --prefix=/usr --localstatedir=/var \
    --sysconfdir=/etc \
    --with-os-dir=/srv/ganeti/os
  make && make install

That seems to do the trick:  ‘gnt-os list’ includes debootstrap, and creating a new instance seems to work as expected.  As I type this, it seems to be starting up!

Connecting a Console to a Running Instance:

When I first ran ‘gnt-instance console instancename’ I got an error that /usr/bin/socat was missing.  Installing it with ‘aptitude install socat’ fixed that error, but the console doesn’t seem responsive, and a kvm process has been using 100% of one core for about 5 minutes now.

Accessing an Instance’s Disks:

As part of my debugging, I wanted to try to access the disk image of the instance to see if the log files showed anything.  This was a challenge in and of itself.  From the Ganeti documentation, I thought that running ‘gnt-instance activate-disks instancename’ would give me the name of a device I could mount, but trying to mount that device generated an error:

mount: wrong fs type, bad option, bad superblock on /dev/mapper/xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so

A little digging and I learned that the virtual disk actually had partitions, which needed to be mapped before I could mount the partition.

# gnt-instance activate-disks vm1
vmhost3.office.alki.com:disk/0:/dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0
# kpartx -av /dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0
add map xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0p1 (252:7): 0 2088449 linear /dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0 1
# mount -t ext3 /dev/mapper/xenvg-a505f631--72fe--4100--a7e5--b3efae6d8082.disk0p1 /mnt/kvm-image
# ls /mnt/kvm-image/
bin  boot  dev  etc  home  lib  lost+found  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var
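
The odd-looking /dev/mapper name in the mount command follows a pattern: device-mapper joins the volume group and logical volume names with a single hyphen and doubles any hyphen inside either name, and kpartx appended p1 for the first partition here. A small sketch of that mapping, assuming the volume name ends in a digit as it does above (dm_partition_path is a made-up helper name):

```shell
# Sketch only: derive the /dev/mapper partition path from VG name, LV name
# and partition number, matching the names shown in the transcript above.
dm_partition_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')   # hyphens inside names are doubled
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%sp%s\n' "$vg" "$lv" "$3"
}

# Example:
#   dm_partition_path xenvg a505f631-72fe-4100-a7e5-b3efae6d8082.disk0 1
#
# When finished, undo the steps in reverse order (as root):
#   umount /mnt/kvm-image
#   kpartx -dv /dev/xenvg/a505f631-72fe-4100-a7e5-b3efae6d8082.disk0
#   gnt-instance deactivate-disks vm1
```

Unmounting and removing the partition mappings before the instance is started again avoids having the host and the guest write to the same filesystem.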

From checking the log directory, it is clear that whatever is going on, it’s never getting to the point where it can write to a log file.

Hmmmm, maybe it has something to do with the fact that there is no kernel or initrd?  Could that be it, maybe? Hmmm.

UPDATE: As of this writing, I still don’t have an instance running successfully.  I’m going to spend a little more time trying to get it to work and then probably cut bait and use the basic VM management tools Ubuntu provides.

The Ganeti community seems pretty thin. The Google group has had undealt-with spam for the last few days, and an appeal for help I posted hasn’t drawn any response.  I found an IRC group on Freenode, but there are only two other people in it, and they may well be dead.  It’s too bad, because it seems like cool software.  I guess the other option is to try using Xen instead of KVM, and/or try using the packaged version in the universe repository.

6 thoughts on “Tips on Using Ganeti to Manage a KVM-based Virtual Machine Cluster on Ubuntu Jaunty Jackalope 9.04”

  1. Bob

    OK, I beat my head against this thing for a couple of days, and I think I figured out what the essential problem is: Ganeti expects kvm to use virtio disks, and kvm isn’t happy about that. Now, I made dozens of changes and reversions and false starts in getting to the point that the ganeti instance would boot, so I could be forgetting some other essential issue here, but I think that the following two changes beyond what you’ve already done will work:

    First, after you first initialize your cluster, execute the command:

    gnt-cluster modify --hypervisor-parameters kvm:root_path=/dev/sda1

    Then use the following command to create your instance:

    gnt-instance add -t plain -s1G -n -H kvm:disk_type=ide -o debootstrap

    Disclosure: By the time I got to the point of making this work, I’d given up on the stock debootstrap definitions and had built a version to bootstrap a Jaunty server instance, so I haven’t tested this with the stock lenny defs. FWIW, I found this helpful in building my definitions:

    http://www.psychocats.net/ubuntucat/creating-a-passwordless-account-in-ubuntu/

    Thanks for your post, it definitely saved me a bunch of time.

  2. bithive

    I just spent two days playing with ganeti and never once got an instance to boot under Xen or KVM. I was really hoping the information on this page would make the difference but it did not.

    I’m giving up on ganeti for a year or two until the documentation adds sections about “what to do if things don’t work perfectly”, as currently there are none.

  3. Heriyono Sim

    These tips below will help you tremendously. Please drop a message if so.

    Follow: http://ganeti-doc.googlecode.com/svn/ganeti-2.1/html/install-quick.html
    Follow: http://ganeti-doc.googlecode.com/svn/ganeti-2.1/html/install.html
    Follow: http://geekfun.com/2009/07/14/tips-on-using-ganeti-to-manage-a-kvm-based-virtual-machine-cluster-on-ubunty-jaunty-jackelope-9-04/

    For debootstrap:
    Use:
    ./configure --prefix=/usr --localstatedir=/var \
    --sysconfdir=/etc \
    --with-os-dir=/srv/ganeti/os
    make && make install

    ln -s vmlinuz-2.6.24-11-pve /boot/kernel-2.6-kvmU
    ln -s initrd.img-2.6.24-11-pve /boot/initrd-2.6-kvmU
    gnt-cluster modify -H kvm:kernel_path=/boot/kernel-2.6-kvmU
    gnt-cluster modify -H kvm:initrd_path=/boot/initrd-2.6-kvmU
    gnt-cluster modify -H kvm:vnc_bind_address=127.0.0.1
    gnt-cluster info

    gnt-instance add -t plain -s 1G -n qrm.kosmos.sg -o debootstrap+default mail.kosmos.sg
    ps -Af | grep kvm
    Check to see correct arguments…
    /usr/bin/kvm -name mail.kosmos.sg -m 128 -smp 1 -pidfile /var/run/ganeti/kvm-hypervisor/pid/mail.kosmos.sg -daemonize -boot c -drive file=/var/run/ganeti/instance-disks/mail.kosmos.sg:0,format=raw,if=virtio,boot=on -kernel /boot/vmlinuz-2.6-kvmU -initrd /boot/initrd-2.6-kvmU -append "root=/dev/vda1 ro" -vnc 127.0.0.1:3 -usbdevice tablet -serial unix:/var/run/ganeti/kvm-hypervisor/ctrl/mail.kosmos.sg.serial,server,nowait -net nic,vlan=0,macaddr=aa:00:00:df:af:51,model=virtio

    Note: By default virtio is used (most distros’ initrds have built-in support for this).
    Note: The default debootstrap does not set up networking.

    gnt-instance console mail.kosmos.sg
    Check: -vnc host:d
    By convention the TCP port is 5900+d
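
The port convention can be written as a one-line shell helper. This is only a sketch of the arithmetic; vnc_port is a made-up name, and the argument is whatever follows -vnc on the kvm command line.

```shell
# Sketch only: turn a "-vnc host:d" value (or a bare display number d)
# into the TCP port a VNC viewer should connect to.
vnc_port() {
    display="${1##*:}"        # keep what follows the last colon
    echo $((5900 + display))
}

# Example: the instance above runs with "-vnc 127.0.0.1:3"
#   vnc_port 127.0.0.1:3
```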

    To mount instance disk on host,
    Read: http://alexeytorkhov.blogspot.com/2009/09/mounting-raw-and-qcow2-vm-disk-images.html
    lvdisplay
    losetup /dev/loop0
    kpartx -a /dev/loop0
    mount /dev/mapper/loop0p1 /mnt/image

  4. herky

    Hi,

    kvm and the console won’t work together nicely. Try binding the instance to VNC (0.0.0.0) and use a VNC viewer to connect to your instance. To check the port of your instance, run ‘gnt-instance info instanceofyours’.

    greets

  5. Pingback: Bookmarks for June 20th through December 13th — Somewhere out there!

  6. Pingback: Ganeti by gregers - Pearltrees
