AoE root for KVM guests

Intro

So. I’m trying to get familiar with libvirt and friends. To this end, I’ve set up a Lucid virtual machine booting from PXE into an initrd environment which does a pivot_root to an AoE block device.

The #virt channel on irc.oftc.net told me that in order to have libvirt provide PXE capability, I would have to install a recent version of libvirt. I built version 0.7.5-3 from sid on my karmic laptop and it seems to be working okay.

I decided to set up the pxe root directoy in /var/lib/tftproot just because that’s what the example code had in it.

Configure the Virtual Network

I had to manually configure a virtual network. Here is the XML config file:

$ sudo virsh net-dumpxml netboot
<network>
  <name>netboot</name>
  <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
  <forward mode='nat'/>
  <bridge name='virbr1' stp='off' delay='1' />
  <domain name='example.com'/>
  <ip address='192.168.123.1' netmask='255.255.255.0'>
    <tftp root='/var/lib/tftproot' />
    <dhcp>
      <range start='192.168.123.2' end='192.168.123.254' />
      <bootp file='pxelinux.0' />
    </dhcp>
  </ip>
</network>

Install syslinux

This, of course, depends on the pxelinux.0 file. Luckily, this is packaged up in syslinux and can be installed with a simple


$ sudo apt-get install syslinux
$ sudo mkdir /var/lib/tftproot
$ sudo cp /usr/lib/syslinux/pxelinux.0 /var/lib/tftproot

Configure PXE boot parameters

I had to create a pxelinux config file for the virtual machine (indexed by mac address). Note that I put a console=ttyS0,115200 argument on the kernel command line so that I can attach to the serial port from the host system for copy/paste debugging. Also of importance is the root=/dev/etherd/e0.1p1 argument, specifying which block device we’ll be doing the pivot_root to eventually.


$ mkdir /var/lib/tftproot/pxelinux.cfg/
$ cat /var/lib/tftproot/pxelinux.cfg/01-52-54-00-44-34-67
DEFAULT linux
LABEL linux
SAY Now booting the kernel from PXELINUX...
KERNEL vmlinuz-lucid0
APPEND ro root=/dev/etherd/e0.1p1 console=ttyS0,115200 initrd=initrd.img-lucid0

I decided to use the karmic kernel for lucid initially. I’ll eventually switch over to the lucid kernel ;)


$ sudo cp /boot/vmlinuz-2.6.31-17-generic /var/lib/tftproot/vmlinuz-lucid0

Customize initramfs-tools

I copied /etc/initramfs-tools to ~/tmp/lucid so that I didn’t mess up the system initrd scripts:


$ mkdir -p ~/tmp/lucid && cp -r /etc/initramfs-tools ~/tmp/lucid/

Since mkinitramfs doesn’t currently have a system for AoE root, I had to do a bit of fiddling. I copied the NFS root boot script and made a couple of modifications.

$ diff -u /usr/share/initramfs-tools/scripts/nfs ~/tmp/lucid/initramfs-tools/scripts/aoe
--- /usr/share/initramfs-tools/scripts/nfs	2008-06-23 23:10:21.000000000 -0700
+++ /home/cjac/tmp/lucid/initramfs-tools/scripts/aoe	2010-01-15 14:56:28.098298027 -0800
@@ -5,59 +5,25 @@
 retry_nr=0
 
 # parse nfs bootargs and mount nfs 
-do_nfsmount()
+do_aoemount()
 {
-
 	configure_networking
 
-	# get nfs root from dhcp
-	if [ "x${NFSROOT}" = "xauto" ]; then
-		# check if server ip is part of dhcp root-path
-		if [ "${ROOTPATH#*:}" = "${ROOTPATH}" ]; then
-			NFSROOT=${ROOTSERVER}:${ROOTPATH}
-		else
-			NFSROOT=${ROOTPATH}
-		fi
-
-	# nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
-	elif [ -n "${NFSROOT}" ]; then
-		# nfs options are an optional arg
-		if [ "${NFSROOT#*,}" != "${NFSROOT}" ]; then
-			NFSOPTS="-o ${NFSROOT#*,}"
-		fi
-		NFSROOT=${NFSROOT%%,*}
-		if [ "${NFSROOT#*:}" = "$NFSROOT" ]; then
-			NFSROOT=${ROOTSERVER}:${NFSROOT}
-		fi
-	fi
+        ip link set up dev eth0
 
-	if [ -z "${NFSOPTS}" ]; then
-		NFSOPTS="-o retrans=10"
-	fi
+        ls /dev/etherd/
 
-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-premount"
-	run_scripts /scripts/nfs-premount
-	[ "$quiet" != "y" ] && log_end_msg
+        echo > /dev/etherd/discover
 
-	if [ ${readonly} = y ]; then
-		roflag="-o ro"
-	else
-		roflag="-o rw"
-	fi
+        ls /dev/etherd/
 
-	nfsmount -o nolock ${roflag} ${NFSOPTS} ${NFSROOT} ${rootmnt}
+        mount ${ROOT} ${rootmnt}
 }
 
-# NFS root mounting
+# AoE root mounting
 mountroot()
 {
-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-top"
-	run_scripts /scripts/nfs-top
-	[ "$quiet" != "y" ] && log_end_msg
-
-	modprobe nfs
-	# For DHCP
-	modprobe af_packet
+	modprobe aoe
 
 	# Default delay is around 180s
 	# FIXME: add usplash_write info
@@ -67,17 +33,13 @@
 		delay=${ROOTDELAY}
 	fi
 
-	# loop until nfsmount succeds
+	# loop until aoemount succeds
 	while [ ${retry_nr} -lt ${delay} ] && [ ! -e ${rootmnt}${init} ]; do
 		[ ${retry_nr} -gt 0 ] && \
-		[ "$quiet" != "y" ] && log_begin_msg "Retrying nfs mount"
-		do_nfsmount
+		[ "$quiet" != "y" ] && log_begin_msg "Retrying AoE mount"
+		do_aoemount
 		retry_nr=$(( ${retry_nr} + 1 ))
 		[ ! -e ${rootmnt}${init} ] && /bin/sleep 1
 		[ ${retry_nr} -gt 0 ] && [ "$quiet" != "y" ] && log_end_msg
 	done
-
-	[ "$quiet" != "y" ] && log_begin_msg "Running /scripts/nfs-bottom"
-	run_scripts /scripts/nfs-bottom
-	[ "$quiet" != "y" ] && log_end_msg
 }

(below is the full file in case udiff is less convenient)

$ cat ~/tmp/lucid/initramfs-tools/scripts/aoe
# NFS filesystem mounting			-*- shell-script -*-

# FIXME This needs error checking

retry_nr=0

# parse nfs bootargs and mount nfs 
do_aoemount()
{
	configure_networking

        ip link set up dev eth0

        ls /dev/etherd/

        echo > /dev/etherd/discover

        ls /dev/etherd/

        mount ${ROOT} ${rootmnt}
}

# AoE root mounting
mountroot()
{
	modprobe aoe

	# Default delay is around 180s
	# FIXME: add usplash_write info
	if [ -z "${ROOTDELAY}" ]; then
		delay=180
	else
		delay=${ROOTDELAY}
	fi

	# loop until aoemount succeds
	while [ ${retry_nr} -lt ${delay} ] && [ ! -e ${rootmnt}${init} ]; do
		[ ${retry_nr} -gt 0 ] && \
		[ "$quiet" != "y" ] && log_begin_msg "Retrying AoE mount"
		do_aoemount
		retry_nr=$(( ${retry_nr} + 1 ))
		[ ! -e ${rootmnt}${init} ] && /bin/sleep 1
		[ ${retry_nr} -gt 0 ] && [ "$quiet" != "y" ] && log_end_msg
	done
}

There was also a small modification to the initramfs.conf file:

$ diff -u /etc/initramfs-tools/initramfs.conf ~/tmp/lucid/initramfs-tools/initramfs.conf
--- /etc/initramfs-tools/initramfs.conf	2008-07-08 18:37:42.000000000 -0700
+++ /home/cjac/tmp/lucid/initramfs-tools/initramfs.conf	2010-01-15 14:33:38.088295207 -0800
@@ -47,14 +47,16 @@
 #
 
 #
-# BOOT: [ local | nfs ]
+# BOOT: [ local | nfs | aoe]
 #
 # local - Boot off of local media (harddrive, USB stick).
 #
 # nfs - Boot using an NFS drive as the root of the drive.
 #
+# aoe - Boot using an AoE drive as the root of the drive.
+#
 
-BOOT=local
+BOOT=aoe
 
 #
 # DEVICE: ...

I also needed to add aoe to the list of modules included in the initramfs:


$ echo aoe >> ~/tmp/lucid/initramfs-tools/modules

In order to generate the initrd.img file from this new config, I ran the following:


$ sudo mkinitramfs -d ~/tmp/lucid/initramfs-tools/ -o /var/lib/tftproot/initrd.img-lucid0

Install OS to virtual block device

I created a lucid VM by installing from the desktop install disk. You can grab the ISO here:

http://cdimage.ubuntu.com/daily-live/current/

I’ll leave the creation of the virtual machine and installation as an exercise for the reader. I put the filesystem on an lvm volume group called vg0 in a logical volume called lucid0 (ie, /dev/vg0/lucid0).

Create virtual machine definition with virsh

At this point, I created a new virtual machine called lucid0. Here is the xml for the domain:

$ sudo virsh dumpxml lucid0
<domain type='kvm' id='1'>
  <name>lucid0</name>
  <uuid>96fbad21-4f25-5700-ddd8-1a565c7170ee</uuid>
  <memory>524288</memory>
  <currentMemory>524288</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc-0.11'>hvm</type>
    <boot dev='network'/>
  </os>
  <features>
    <pae/>
  </features>
  <clock offset='localtime'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/kvm</emulator>
    <interface type='network'>
      <mac address='52:54:00:44:34:67'/>
      <source network='netboot'/>
      <target dev='vnet0'/>
    </interface>
    <serial type='pty'>
   <source path='/dev/pts/4'/>
      <target port='0'/>
    </serial>
  <console type='pty' tty='/dev/pts/4'>
   <source path='/dev/pts/4'/>
      <target port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5901' autoport='yes' listen='127.0.0.1' keymap='en-us'/>
    <sound model='es1370'/>
    <video>
      <model type='cirrus' vram='9216' heads='1'/>
    </video>
  </devices>
</domain>

Start AoE target

Now we’re ready to start the AoE target and launch the virtual machine. If you don’t have vblade installed, do so now:


$ sudo apt-get install vblade

Start the target up with the following command:


$ sudo vbladed 0 1 virbr1 /dev/vg0/lucid0

Boot the virtual machine

Now, if all goes well, you should be able to watch the virtual machine boot up and do its thing like so:


$ sudo virsh start lucid0 && sudo screen -S lucid0 `sudo virsh ttyconsole lucid0` 115200

If you get errors about /dev/etherd/e0.1p1 not existing (these might look like this):

Begin: Retrying AoE mount ...
err         discover    interfaces  revalidate  flush
err         discover    interfaces  revalidate  flush
mount: mounting /dev/etherd/e0.1p1 on /root failed: No such file or directory
Done.

then you might want to try restarting vbladed like this:


$ sudo kill -9 `ps auwx | grep vblade | grep -v grep | awk '{print $2}' ` && sudo vbladed 0 1 virbr1 /dev/vg0/lucid0

Questions? Comments?

So. Now you should have a lucid gdm in your virt-manager console. Any questions? #virt on irc.oftc.net

Also, feel free to email me

This entry was posted in debian, dns, Free Software, Hardware, irc, libvirt, linux, lvm, Networking, qemu, Software, ubuntu, virtualization, xen. Bookmark the permalink.

One Response to AoE root for KVM guests

  1. (08:20:10) cj: MikeN: hey, look who’s the second result for ‘aoe root’ in google search results
    (08:53:02) MikeN: aww
    (08:53:13) MikeN: we’re like AoE root filesystem BFFs
    (09:11:13) MikeN: jlott is #4 there
    (09:38:21) cj: we could be like super heroes and save the universe with our aoe root awesomeness.
    (09:38:40) MikeN: yep
    (09:38:43) MikeN: it would rule

Leave a Reply