Running Ubuntu cloud-images, with cloud-init, on all infrastructure, from cloud to bare metal

So we’re happy provisioning our AWS, GCP, DigitalOcean, Azure, and other virtual machines with cloud-init.

In the cloud, the providers offer “metadata” support (sometimes called user-data) and mostly clean, working, updated Ubuntu images which start cloud-init, find the metadata (usually through the “magic” address 169.254.169.254) and away we go.

Off the public clouds, though, it’s a different beast. The good news is, we can make it work, and unify all provisioning, from developer boxes to production physical machines, including private clouds.

Ubuntu provides wonderful little images, updated almost daily and released in line with the AWS AMIs, at https://cloud-images.ubuntu.com/ – for a fully updated 16.04 LTS image, go to https://cloud-images.ubuntu.com/xenial/current/ – there are many formats available: QCOW2, VHD, OVA, etc. Thanks Ubuntu!

Using the qemu-img utility, you can convert, say, the QCOW2 image (xenial-server-cloudimg-amd64-disk1.img) into any other format, e.g. VMDK, or even raw.
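
For reference, the conversion is a one-liner; something like this (the output filename is just a placeholder):

# QCOW2 -> VMDK; use -O raw, -O vpc (VHD), etc. for other formats
qemu-img convert -f qcow2 -O vmdk xenial-server-cloudimg-amd64-disk1.img xenial-cloudimg.vmdk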

We’ve also exploited cloud-init’s “nocloud” datasource to boot regular VMware Workstation/Fusion/etc, using an ISO image containing the meta-data. The first virtual hard disk is a converted cloud-image from Ubuntu, and a virtual CD-ROM ISO is attached. The tiny ISO is prepared using “cloud-localds” from the cloud-image-utils package. This has been documented since 2013 by Scott Moser, the father of cloud-init.
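
Roughly, that tiny seed ISO is built like this (the meta-data/user-data contents below are just a minimal illustration; the hostname and SSH key are placeholders):

cat > meta-data <<EOF
instance-id: iid-local01
local-hostname: cloudimg-test
EOF
cat > user-data <<EOF
#cloud-config
ssh_authorized_keys:
  - ssh-rsa AAAA... you@workstation
EOF
# cloud-localds comes from the cloud-image-utils package
cloud-localds seed.iso user-data meta-data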

In a KVM/libvirt environment, it’s even easier. KVM/libvirt allow you to “direct kernel boot” using the kernel and initrd, passing kernel command-line parameters directly, and with that you can coerce cloud-init by adding a kernel parameter “ds=nocloud-net;s=http://your/data/source”. Works beautifully. (nocloud-net adds “meta-data” and “user-data” to the end of the URL automatically).
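
As an illustration, a direct kernel boot with plain QEMU/KVM looks something like this (kernel/initrd paths, memory size, root device and the URL are all placeholders; with libvirt you’d put the same kernel, initrd and cmdline values into the domain’s direct kernel boot settings):

qemu-system-x86_64 -enable-kvm -m 1024 \
  -drive file=xenial-cloudimg.qcow2,if=virtio \
  -kernel vmlinuz -initrd initrd.img \
  -append 'root=/dev/vda1 ro ds=nocloud-net;s=http://your/data/source/'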

For vSphere/free ESXi deployments, even though the ISO trick would work, what we do is prepare a VMDK image that’s already set up: before converting the image to VMDK, we mount it temporarily (using qemu-nbd), change a few configuration options, and add the kernel cmdline parameters directly to /boot/grub/grub.cfg. Depending on the VMware environment, it’s also helpful to pre-set the network address configuration, etc. After that we unmount, convert to VMDK, and deploy with VMware’s ovftool directly to ESXi.
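
The qemu-nbd part of that preparation looks roughly like this (it assumes the stock cloud image’s single root partition shows up as /dev/nbd0p1; the exact edits inside the image will depend on your environment):

modprobe nbd max_part=8
qemu-nbd --connect=/dev/nbd0 xenial-server-cloudimg-amd64-disk1.img
mount /dev/nbd0p1 /mnt
# edit /mnt/boot/grub/grub.cfg (kernel cmdline), /mnt/etc/network/interfaces, etc.
umount /mnt
qemu-nbd --disconnect /dev/nbd0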

The most recent challenge was how to provision physical, bare-metal servers using the same method, so that we can re-use all our cloud-init scripts and infrastructure.

Around the internet, most people are using network boot (PXE). Unfortunately, the environment we’re working with does not support PXE for unrelated reasons, so we had to do something less standard.

The current process is almost the same as the ESXi case above: we prepare a cloud-image (from the Ubuntu QCOW2 source), mount it, and preconfigure network/grub/etc. The image is then converted to raw format and made available via an HTTP endpoint.
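
The last two steps of that preparation are nothing special; roughly (filenames and web root are placeholders, any HTTP server that can hand out a large file will do):

qemu-img convert -f qcow2 -O raw preconfigured-cloudimg.qcow2 image_we_prepared.img
cp image_we_prepared.img /var/www/html/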

At the physical machine, things get a lot more manual: we use an Ubuntu rescue image (the regular mini.iso from the Ubuntu distro) and boot it on the physical hardware. Rescue mode sets up networking, disk access, and the keyboard. We drop to a shell, without mounting any filesystems, and issue something along the lines of:
wget -O - http://somehost/image_we_prepared.img | dd bs=2M of=/dev/sda

This unceremoniously destroys the contents of the physical machine, replacing them with the cloud-image we prepared, partition table and all. After that, reboot, and voilà, there goes cloud-init, exactly as before.

There’s a gotcha in this process: the cloud-image does not contain most of the kernel drivers needed for a physical machine. In my case, on a Dell 1950, the PERC controller’s driver (megaraid_sas) was not included, resulting in a failed boot. To fix this, ‘rescue’ into the image, install linux-image-generic, and run update-initramfs -u. We’re working on moving this fix into the automated image-preparation step.
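
From the rescue shell, that fix is roughly the following (it assumes the image was written to /dev/sda with the root filesystem on the first partition; you may need a working /etc/resolv.conf inside the chroot so apt can resolve names):

mount /dev/sda1 /mnt
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt /bin/bash
# inside the chroot: pull in the full driver set and rebuild the initramfs
apt-get update
apt-get install -y linux-image-generic
update-initramfs -u
exit
umount /mnt/dev /mnt/proc /mnt/sys /mnt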

This is all done using DRAC virtual media and virtual console, but would work equally well using a burned CD with mini.iso and remote hands at the datacenter.

In the future (depending on demand for bare-metal provisioning; we’ve only had to provision a few hosts so far), I’d like to create my own mini.iso that automates the dd-ing of the image and the fixes; but that sounds like a lot of work, and an equivalent PXE setup seems much saner. CoreOS seems to have nailed this already.

cloudinit all the things!
