Running Ubuntu cloud-images, with cloud-init, on all infrastructure, from cloud to bare metal

So we’re happy provisioning our AWS, GCP, DigitalOcean, Azure, and other virtual machines with cloud-init.

In the cloud, the providers offer “metadata” support (sometimes called user-data) and mostly clean, working, updated Ubuntu images which start cloud-init, find the metadata (usually through the “magic” address 169.254.169.254) and away we go.
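On EC2, for instance, you can poke at the metadata service from inside an instance (other providers expose similar endpoints, usually at that same address):

curl http://169.254.169.254/latest/meta-data/instance-id
curl http://169.254.169.254/latest/user-data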

Off the public clouds, though, it’s a different beast. The good news is, we can make it work, and unify all provisioning, from developer boxes to production physical machines, including private clouds.

Ubuntu provides wonderful little images, updated almost daily and released in line with the AWS AMIs, at https://cloud-images.ubuntu.com/ – for a fully updated 16.04 LTS image, go to https://cloud-images.ubuntu.com/xenial/current/ – there are many formats available: QCOW2, VHD, OVA, etc. Thanks, Ubuntu!

Using the qemu-img utility, you can convert, say, the QCOW2 image (xenial-server-cloudimg-amd64-disk1.img) into any other format, e.g. VMDK, or even raw.
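For example, to get VMDK and raw copies of the xenial image above (output filenames are arbitrary):

qemu-img convert -f qcow2 -O vmdk xenial-server-cloudimg-amd64-disk1.img xenial.vmdk
qemu-img convert -f qcow2 -O raw xenial-server-cloudimg-amd64-disk1.img xenial.raw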

We've also exploited cloud-init's "nocloud" datasource to boot on regular VMWare Workstation/Fusion/etc., using an ISO image containing the metadata. The first virtual hard disk is a converted cloud image from Ubuntu, and a virtual CD-ROM ISO is attached. The tiny ISO is prepared using "cloud-localds" from the cloud-image-utils package. This has been documented since 2013 by Scott Moser, the father of cloud-init.
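Building the seed ISO looks roughly like this, assuming you already have your usual user-data (a cloud-config file); the instance-id and hostname below are just placeholders:

cat > meta-data <<EOF
instance-id: iid-local01
local-hostname: cloudimg-test
EOF
cloud-localds seed.iso user-data meta-data

Attach seed.iso as the VM's CD-ROM next to the converted disk image and cloud-init picks it up on first boot.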

In a KVM/libvirt environment, it's even easier. KVM/libvirt lets you "direct kernel boot" from a kernel and initrd, passing kernel command-line parameters directly, and with that you can point cloud-init at your data source by adding the kernel parameter "ds=nocloud-net;s=http://your/data/source". Works beautifully. (nocloud-net appends "meta-data" and "user-data" to the end of the URL automatically.)
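For reference, the relevant fragment of the libvirt domain XML looks roughly like this (the kernel/initrd paths, the root device, and the URL are placeholders for whatever you use):

<os>
  <type>hvm</type>
  <kernel>/var/lib/libvirt/boot/xenial-vmlinuz</kernel>
  <initrd>/var/lib/libvirt/boot/xenial-initrd</initrd>
  <cmdline>root=/dev/vda1 console=ttyS0 ds=nocloud-net;s=http://your/data/source</cmdline>
</os>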

For vSphere/free ESXi deployments, even though the ISO trick would work, what we do is prepare a VMDK image that is already set up: before converting the image to VMDK, we mount it temporarily (using qemu-nbd), change a few configuration options, and add the kernel cmdline parameters directly to /boot/grub/grub.cfg. Depending on the VMWare environment, it's also helpful to pre-set the network address configuration, etc. After that we unmount, convert to VMDK, and deploy with VMWare's ovftool directly to ESXi.
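The qemu-nbd round trip is roughly as follows (device names, mount point, and exactly which files you tweak will vary):

modprobe nbd max_part=16
qemu-nbd --connect=/dev/nbd0 xenial-server-cloudimg-amd64-disk1.img
mount /dev/nbd0p1 /mnt
# edit /mnt/boot/grub/grub.cfg (kernel cmdline), /mnt/etc/network/interfaces, etc.
umount /mnt
qemu-nbd --disconnect /dev/nbd0
qemu-img convert -f qcow2 -O vmdk xenial-server-cloudimg-amd64-disk1.img xenial-prepped.vmdk

(Depending on the ESXi version, you may need a specific VMDK subformat; check qemu-img's -o options for vmdk.)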

The most recent challenge was how to provision physical, bare-metal servers using the same method, so that we can re-use all our cloud-init scripts and infrastructure.

Around the internet most people are using network boot (PXE). Unfortunately, the environment we're working with does not support PXE, for unrelated reasons, so we have to do something less standard.

The current process is almost the same as the ESXi case above: we prepare a cloud image (from the Ubuntu QCOW2 source), mount it, and preconfigure network/grub/etc. This image is converted to raw format and made available via an HTTP endpoint.
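Something along these lines (the prepared image name is a placeholder chosen to match the wget below; any static web server will do):

qemu-img convert -f qcow2 -O raw xenial-prepped.qcow2 image_we_prepared.img
cd /srv/images && python3 -m http.server 8000   # then fetch as http://somehost:8000/image_we_prepared.img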

At the physical machine, things get a lot more manual: we boot an Ubuntu rescue image (the regular mini.iso from the Ubuntu distribution) on the physical hardware. Rescue mode sets up networking, disk access, and the keyboard. We drop to a shell, without mounting any filesystems, and issue something along the lines of
wget -O - http://somehost/image_we_prepared.img | dd bs=2M of=/dev/sda

This unceremoniously destroys the contents of the physical machine, replacing them with the cloud image we prepared, partition table and all. After that, reboot, and voilà, off goes cloud-init, exactly as before.

There's a gotcha in this process: the cloud image does not contain most of the kernel drivers needed for a physical machine. In my case, on a Dell 1950, the PERC's driver (megaraid_sas) was not included, resulting in a failed boot. To fix this, 'rescue' into the image, install linux-image-generic, and run update-initramfs -u. This is something we're working on moving into the automated image-preparation step.
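Until that lands in the automated step, the manual fix from the rescue shell looks roughly like this (assuming the image's root filesystem ended up on /dev/sda1):

mount /dev/sda1 /mnt
for fs in dev proc sys; do mount --bind /$fs /mnt/$fs; done
# make sure DNS resolution works inside the chroot (e.g. a temporary /etc/resolv.conf)
chroot /mnt apt-get update
chroot /mnt apt-get install -y linux-image-generic
chroot /mnt update-initramfs -u
for fs in dev proc sys; do umount /mnt/$fs; done
umount /mnt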

This is all done using DRAC virtual media and the virtual console, but it would work equally well using a burned CD with mini.iso and remote hands at the datacenter.

For the future (depending on demand for bare-metal provisioning; we've only had to provision a few hosts so far), I'd like to create my own mini.iso that automates the dd-ing of the image and the fixes; but that sounds like a lot of work, and an equivalent PXE setup seems much more sane. CoreOS seems to have nailed this already.

cloudinit all the things!

Fully managed VMWare ESXi 5.1 with Dell Servers

Update 12/August: There are new versions of both ESXi 5.1 (Update 1) and the Dell VIB. The instructions below should continue to work; just remember to use the updated filenames from the downloads. Download the VMware ESXi 5.1 Update 1 Recovery Image (released on April 29, 2013) and the Dell OpenManage Server Administrator vSphere Installation Bundle (VIB) for ESXi 5.1 (released on August 12, 2013).

Update 12/April: There's a problem with OMSA and ESXi 5.1 on (at least) R710 servers. Check out http://communities.vmware.com/thread/439083 and call your Dell/VMWare rep.

You’ve got a newish Dell server (11th generation or newer) and you want to run a fully managed (OMSA, monitoring, SNMP) free ESXi 5.1 system. Continue reading Fully managed VMWare ESXi 5.1 with Dell Servers

Running pfSense nanobsd-vga on VMWare

I've got a couple of old x86 servers. They're great servers, except that their old SCSI disks have failed, or the controller's battery has died, etc. I want to make stable firewalls/routers out of them.
Get pfSense (2.0.1+) nanobsd-vga and burn the image via physdiskwrite to an el-cheapo 4 GB memory stick.
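On the Windows box doing the burning, that's essentially the following (the image filename is a placeholder for whatever version you downloaded; -u lifts physdiskwrite's 2 GB target-disk safety limit so it will write to the 4 GB stick, and it will list the physical disks and ask which one to overwrite, so pick carefully):

physdiskwrite -u pfSense-2.0.1-nanobsd-vga-4g.img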
nanobsd runs entirely from RAM disks and is quite fast. You can configure pfSense to periodically write configs, RRDs, and other data back to the stick, at one-hour intervals.
nanobsd-vga lets you enjoy nanobsd's fully embedded nature with a traditional x86 VGA console/monitor and keyboard, while plain nanobsd requires a serial console.
This all works very well on physical hardware, but how can we run this exact same configuration on VMWare, and enjoy full virtual networking for testing? Continue reading Running pfSense nanobsd-vga on VMWare

Old DELL CERC SATA monitoring on new Debian Squeeze (PowerEdge PE 830)

lspci says it's a "Dell CERC SATA RAID 2 PCI SATA 6ch (DellCorsair)", under "Adaptec AAC-RAID (rev 01)". Dell OMSA 6.5.x installs perfectly and, after a reboot, detects everything but the CERC controller; so add the repository as per the instructions at http://hwraid.le-vert.net/wiki/DebianPackages, then "apt-get install aacraid-status" and run "aacraid-status". It should output something like Continue reading Old DELL CERC SATA monitoring on new Debian Squeeze (PowerEdge PE 830)
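On squeeze, that boils down to roughly the following (the exact sources line and the repository key import are spelled out on that wiki page):

echo "deb http://hwraid.le-vert.net/debian squeeze main" > /etc/apt/sources.list.d/hwraid.list
apt-get update
apt-get install aacraid-status
aacraid-status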