Category Archives: UEFI

Posts related to UEFI

Deploying Encrypted Images for Confidential Computing

In the previous post I looked at how you build an encrypted image that can maintain its confidentiality inside AMD SEV or Intel TDX. In this post I’ll discuss how you actually bring up a confidential VM from an encrypted image while preserving secrecy. However, first a warning: This post represents the state of the art and includes patches that are certainly not deployed in distributions and may not even be upstream, so if you want to follow along at home you’ll need to patch things like qemu, grub and OVMF. I should also add that, although I’m trying to make everything generic to confidential environments, this post is based on AMD SEV, which is the only confidential encrypted1 environment currently shipping.

The Basics of a Confidential Computing VM

At its base, current confidential computing environments are about using encrypted memory to run the virtual machine and guarding the encryption key so that the owner of the host system (the cloud service provider) can’t get access to it. Both SEV and TDX have the encryption technology inside the main memory controller meaning the L1 cache isn’t encrypted (still vulnerable to cache side channels) and DMA to devices must also be done via unencryped memory. This latter also means that both the BIOS and the Operating System of the guest VM must be enlightened to understand which pages to encrypted and which must not. For this reason, all confidential VM systems use OVMF2 to boot because this contains the necessary enlightening. To a guest, the VM encryption looks identical to full memory encryption on a physical system, so as long as you have a kernel which supports Intel or AMD full memory encryption, it should boot.

Each confidential computing system has a security element which sits between the encrypted VM and the host. In SEV this is an aarch64 processor called the Platform Security Processor (PSP) and in TDX it is an SGX enclave running Intel proprietary code. The job of the PSP is to bootstrap the VM, including encrypting the initial OVMF and inserting the encrypted pages. The security element also includes a validation certificate, which incorporates a Diffie-Hellman (DH) key. Once the guest owner obtains and validates the DH key it can use it to construct a one time ECDH encrypted bundle that can be passed to the security element on bring up. This bundle includes an encryption key which can be used to encrypt secrets for the security element and a validation key which can be used to verify measurements from the security element.

The way QEMU boots a Q35 machine is to set up all the configuration (including a disk device attached to the VM Image) load up the OVMF into rom memory and start the system running. OVMF pulls in the QEMU configuration and constructs the necessary ACPI configuration tables before executing grub and the kernel from the attached storage device. In a confidential VM, the first task is to establish a Guest Owner (the person whose encrypted VM it is) which is usually different from the Host Owner (the person running or controlling the Physical System). Ownership is established by transferring an encrypted bundle to the Secure Element before the VM is constructed.

The next step is for the VMM (QEMU in this case) to ask the secure element to provision the OVMF Firmware. Since the initial OVMF is untrusted, the Guest Owner should ask the Secure Element for an attestation of the memory contents before the VM is started. Since all paths lead through the Host Owner, who is also untrusted, the attestation contains a random nonce to prevent replay and is HMAC’d with a Guest Supplied key from the Launch Bundle. Once the Guest Owner is happy with the VM state, it supplies the Wrapped Key to the secure element (along with the nonce to prevent replay) and the Secure Element unwraps the key and provisions it to the VM where the Guest OS can use it for disc encryption. Finally, the enlightened guest reads the encrypted disk to unencrypted memory using DMA but uses the disk encryptor to decrypt it to encrypted memory, so the contents of the Encrypted VM Image are never visible to the Host Owner.

The Gaps in the System

The most obvious gap is that EFI booting systems don’t go straight from the OVMF firmware to the OS, they have to go via an EFI bootloader (grub, usually) which must be an efi binary on an unencrypted vFAT partition. The second gap is that grub must be modified to pick the disk encryption key out of wherever the Secure Element has stashed it. The third is that the key is currently stashed in VM memory before OVMF starts, so OVMF must know not to use or corrupt the memory. A fourth problem is that the current recommended way of booting OVMF has a flash drive for persistent variable storage which is under the control of the host owner and which isn’t part of the initial measurement.

Plugging The Gaps: OVMF

To deal with the problems in reverse order: the variable issue can be solved simply by not having a persistent variable store, since any mutable configuration information could be used to subvert the boot and leak the secret. This is achieved by stripping all the mutable variable handling out of OVMF. Solving key stashing simply means getting OVMF to set aside a page for a secret area and having QEMU recognise where it is for the secret injection. It turns out AMD were already working on a QEMU configuration table at a known location by the Reset Vector in OVMF, so the secret area is added as one of these entries. Once this is done, QEMU can retrieve the injection location from the OVMF binary so it doesn’t have to be specified in the QEMU Machine Protocol (QMP) command. Finally OVMF can protect the secret and package it up as an EFI configuration table for later collection by the bootloader.

The final OVMF change (which is in the same patch set) is to pull grub inside a Firmware Volume and execute it directly. This certainly isn’t the only possible solution to the problem (adding secure boot or an encrypted filesystem were other possibilities) but it is the simplest solution that gives a verifiable component that can be invariant across arbitrary encrypted boots (so the same OVMF can be used to execute any encrypted VM securely). This latter is important because traditionally OVMF is supplied by the host owner rather than being part of the VM image supplied by the guest owner. The grub script that runs from the combined volume must still be trusted to either decrypt the root or reboot to avoid leaking the key. Although the host owner still supplies the combined OVMF, the measurement assures the guest owner of its correctness, which is why having a fairly invariant component is a good idea … so the guest owner doesn’t have potentially thousands of different measurements for approved firmware.

Plugging the Gaps: QEMU

The modifications to QEMU are fairly simple, it just needs to scan the OVMF file to determine the location for the injected secret and inject it correctly using a QMP command.. Since secret injection is already upstream, this is a simple find and make the location optional patch set.

Plugging the Gaps: Grub

Grub today only allows for the manual input of the cryptodisk password. However, in the cloud we can’t do it this way because there’s no guarantee of a secure tty channel to the VM. The solution, therefore, is to modify grub so that the cryptodisk can use secrets from a provider, in addition to the manual input. We then add a provider that can read the efi configuration tables and extract the secret table if it exists. The current incarnation of the proposed patch set is here and it allows cryptodisk to extract a secret from an efisecret provider. Note this isn’t quite the same as the form expected by the upstream OVMF patch in its grub.cfg because now the provider has to be named on the cryptodisk command line thus

cryptodisk -s efisecret

but in all other aspects, Grub/grub.cfg works. I also discovered several other deviations from the initial grub.cfg (like Fedora uses /boot/grub2 instead of /boot/grub like everyone else) so the current incarnation of grub.cfg is here. I’ll update it as it changes.

Putting it All Together

Once you have applied all the above patches and built your version of OVMF with grub inside, you’re ready to do a confidential computing encrypted boot. However, you still need to verify the measurement and inject the encrypted secret. As I said before, this isn’t easy because, due to replay defeat requirements, the secret bundle must be constructed on the fly for each VM boot. From this point on I’m going to be using only AMD SEV as the example because the Intel hardware doesn’t yet exist and AMD kindly gave IBM research a box to play with (Anyone with a new EPYC 7xx1 or 7xx2 based workstation can likely play along at home, but check here). The first thing you need to do is construct a launch bundle. AMD has a tool called sev-tool to do this for you and the first thing you need to do is obtain the platform Diffie Hellman certificate (pdh.cert). The tool will extract this for you

sevtool --pdh_cert_export

Or it can be given to you by the cloud service provider (in this latter case you’ll want to verify the provenance using sevtool –validate_cert_chain, which contacts the AMD site to verify all the details). Once you have a trusted pdh.cert, you can use this to generate your own guest owner DH cert (godh.cert) which should be used only one time to give a semblance of ECDHE. godh.cert is used with pdh.cert to derive an encryption key for the launch bundle. You can generate this with

sevtool --generate_launch_blob <policy>

The gory details of policy are in the SEV manual chapter 3, but most guests use 1 which means no debugging. This command will generate the godh.cert, the launch_blob.bin and a tmp_tk.bin file which you must save and keep secure because it contains the Transport Encryption and Integrity Keys (TEK and TIK) which will be used to encrypt the secret. Figuring out the qemu command line options needed to launch and pause a SEV guest is a bit of a palaver, so here is mine. You’ll likely need to change things, like the QMP port and the location of your OVMF build and the launch secret.

Finally you need to get the launch measure from QMP, verify it against the sha256sum of OVMF.fd and create the secret bundle with the correct GUID headers. Since this is really fiddly to do with sevtool, I wrote this python script3 to do it all (note it requires qmp.py from the qemu git repository). You execute it as

sevsecret.py --passwd <disk passwd> --tiktek-file <location of tmp_tk.bin> --ovmf-hash <hash> --socket <qmp socket>

And it will verify the launch measure and encrypt the secret for the VM if the measure is correct and start the VM. If you got everything correct the VM will simply boot up without asking for a password (if you inject the wrong secret, it will still ask). And there you have it: you’ve booted up a confidential VM from an encrypted image file. If you’re like me, you’ll also want to fire up gdb on the qemu process just to show that the entire memory of the VM is encrypted …

Conclusions and Caveats

The above script should allow you to boot an encrypted VM anywhere: locally or in the cloud, provided you can access the QMP port (most clouds use libvirt which introduces yet another additional layering pain). The biggest drawback, if you refer to the diagram, is the yellow box: you must trust the secret element, which in both Intel and AMD is proprietary4, in order to get confidential computing to work. Although there is hope that in future the secret element could be fully open source, it isn’t today.

The next annoyance is that launching a confidential VM is high touch requiring collaboration from both the guest owner and the host owner (due to the anti-replay nonce). For a single launch, this is a minor annoyance but for an autoscaling (launch VMs as needed) platform it becomes a major headache. The solution seems to be to have some Hardware Security Module (HSM), like the cloud uses today to store encryption keys securely, and have it understand how to measure and launch encrypted VMs on behalf of the guest owner.

The final conclusion to remember is that confidentiality is not security: your VM is as exploitable inside a confidential encrypted VM as it was outside. In many ways confidentiality and security are opposites, in that security in part requires reducing the trusted code and confidentiality requires pulling as much as possible inside. Confidential VMs do have an answer to the Cloud trust problem since the enterprise can now deploy VMs without fear of tampering by the cloud provider, but those VMs are as insecure in the cloud as they were in the Enterprise Data Centre. All of this argues that Confidential Computing, while an important milestone, is only one step on the journey to cloud security.

Patch Status

The OVMF patches are upstream (including modifications requested by Intel for TDX). The QEMU and grub patch sets are still on the lists.

Building Encrypted Images for Confidential Computing

With both Intel and AMD announcing confidential computing features to run encrypted virtual machines, IBM research has been looking into a new format for encrypted VM images. The first question is why a new format, after all qcow2 only recently deprecated its old encrypted image format in favour of luks. The problem is that in confidential computing, the guest VM runs inside the secure envelope but the host hypervisor (including the QEMU process) is untrusted and thus runs outside the secure envelope and, unfortunately, even for the new luks format, the encryption of the image is handled by QEMU and so the encryption key would be outside the secure envelope. Thus, a new format is needed to keep the encryption key (and, indeed, the encryption mechanism) within the guest VM itself. Fortunately, encrypted boot of Linux systems has been around for a while, and this can be used as a practical template for constructing a fully confidential encrypted image format and maintaining that confidentiality within a hostile cloud environment. In this article, I’ll explore the state of the art in encrypted boot, constructing EFI encrypted boot images, and finally, in the follow on article, look at deploying an encrypted image into a confidential environment and maintaining key secrecy in the cloud.

Encrypted Boot State of the Art

Luks and the cryptsetup toolkit have been around for a while and recently (in 2018), the luks format was updated to version 2. However, actually booting a linux kernel from an encrypted partition has always been a bit of a systems problem, primarily because the bootloader (grub) must decrypt the partition to actually load the kernel. Fortunately, grub can do this, but unfortunately the current grub in most distributions (2.04) can only read the version 1 luks format. Secondly, the user must type the decryption passphrase into grub (so it can pull the kernel and initial ramdisk out of the encrypted partition to boot them), but grub currently has no mechanism to pass it on to the initial ramdisk for mounting root, meaning that either the user has to type their passphrase twice (annoying) or the initial ramdisk itself has to contain a file with the disk passphrase. This latter is the most commonly used approach and only has minor security implications when the system is in motion (the ramdisk and the key file must be root read only) and the password is protected at rest by the fact that the initial ramdisk is also on the encrypted volume. Even more annoying is the fact that there is no distribution standard way of creating the initial ramdisk. Debian (and Ubuntu) have the most comprehensive documentation on how to do this, so the next section will look at the much less well documented systemd/dracut mechanism.

Encrypted Boot for Systemd/Dracut

Part of the problem here seems to be less that stellar systems co-ordination between the two components. Additionally, the way systemd supports passphraseless encrypted volumes has been evolving for a while but changed again in v246 to mirror the Debian method. Since cloud images are usually pretty up to date, I’ll describe this new way. Each encrypted volume is referred to by UUID (which will be the UUID of the containing partition returned by blkid). To get dracut to boot from an encrypted partition, you must pass in

rd.luks.uuid=<UUID>

but you must also have a key file named

/etc/cryptsetup-keys.d/luks-<UUID>.key

And, since dracut hasn’t yet caught up with this, you usually need a cryptodisk.conf file in /etc/dracut.conf.d/ which contains

install_items+=" /etc/cryptsetup-keys.d/* "

Grub and EFI Booting Encrypted Images

Traditionally grub is actually installed into the disk master boot record, but for EFI boot that changed and the disk (or VM image) must have an EFI System partition which is where the grub.efi binary is installed. Part of the job of the grub.efi binary is to find the root partition and source the /boot/grub5/grub.cfg. When you install grub on an EFI partition a search for the root by UUID is actually embedded into the grub binary. Another problem is likely that your distribution customizes the location of grub and updates the boot variables to tell the system where it is. However, a cloud image can’t rely on the boot variables and must be installed in the default location (\EFI\BOOT\bootx64.efi). This default location can be achieved by adding the –removable flag to grub-install.

For encrypted boot, this becomes harder because the grub in the EFI partition must set up the cryptographic location by UUID. However, if you add

GRUB_ENABLE_CRYPTODISK=y

To /etc/default/grub it will do the necessary in grub-install and grub-mkconfig. Note that on Fedora, where every other GRUB_ENABLE parameter is true/false, this must be ‘y’, unfortunately grub-install will look for =y not =true.

Putting it all together: Encrypted VM Images

Start by extracting the root of an existing VM image to a tar file. Make sure it has all the tools you will need, like cryptodisk and grub-efi. Create a two partition raw image file and loopback mount it (I usually like 4GB) with a small efi partition (p1) and an encrypted root (p2):

truncate -s 4GB disk.img
parted disk.img mklabel gpt
parted disk.img mkpart primary 1Mib 100Mib
parted disk.img mkpart primary 100Mib 100%
parted disk.img set 1 esp on
parted disk.img set 1 boot on

Now setup the efi and cryptosystem (I use ext4, but it’s not required). Note at this time luks will require a password. Use a simple one and change it later. Also note that most encrypted boot documents advise filling the encrypted partition with random numbers. I don’t do this because the additional security afforded is small compared with the advantage of converting the raw image to a smaller qcow2 one.

losetup -P -f disk.img          # assuming here it uses loop0
l=($(losetup -l|grep disk.img)) # verify with losetup -l
mkfs.vfat ${l}p1
blkid ${l}p1       # remember the EFI partition UUID
cryptsetup --type luks1 luksFormat ${l}p2 # choose temp password
blkid ${l}p2       # remember this as <UUID> you'll need it later 
cryptsetup luksOpen ${l}p2 cr_root
mkfs.ext4 /dev/mapper/cr_root
mount /dev/mapper/cr_root /mnt
tar -C /mnt -xpf <vm root tar file>
for m in run sys proc dev; do mount --bind /$m /mnt/$m; done
chroot /mnt

Create or modify /etc/fstab to have root as /dev/disk/cr_root and the EFI partition by label under /boot/efi. Now set up grub for encrypted boot6

echo "GRUB_ENABLE_CRYPTODISK=y" >> /etc/default/grub
mount /boot/efi
grub-install --removable --target=x86_64-efi
grub-mkconfig -o /boot/grub/grub.cfg

For Debian, you’ll need to add an /etc/crypttab entry for the encrypted disk:

cr_root UUID=<uuid> luks none

And then re-create the initial ramdisk. For dracut systems, you’ll have to modify /etc/default/grub so the GRUB_CMDLINE_LINUX has a rd.luks.uuid=<UUID> entry. If this is a selinux based distribution, you may also have to trigger a relabel.

Now would also be a good time to make sure you have a root password you know or to install /root/.ssh/authorized_keys. You should unmount all the binds and /mnt and try EFI booting the image. You’ll still have to type the password a couple of times, but once the image boots you’re operating inside the encrypted envelope. All that remains is to create a fast boot high entropy low iteration password and replace the existing one with it and set the initial ramdisk to use it. This example assumes your image is mounted as SCSI disk sda, but it may be a virtual disk or some other device.

dd if=/dev/urandom bs=1 count=33|base64 -w 0 > /etc/cryptsetup-keys.d/luks-<UUID>.key
chmod 600 /etc/cryptsetup-keys.d/luks-<UUID>.key
cryptsetup --key-slot 1 luksAddKey /dev/sda2 # permanent recovery key
cryptsetup --key-slot 0 luksRemoveKey /dev/sda2 # remove temporary
cryptsetup --key-slot 0 --iter-time 1 luksAddKey /dev/sda2 /etc/cryptsetup-keys.d/luks-<UUID>.key

Note the “-w 0” is necessary to prevent the password from having a trailing newline which will make it difficult to use. For mkinitramfs systems, you’ll now need to modify the /etc/crypttab entry

cr_root UUID=<UUID> /etc/cryptsetup-keys.d/luks-<UUID>.key luks

For dracut you need the key install hook in /etc/dracut.conf.d as described above and for Debian you need the keyfile pattern:

echo "KEYFILE_PATTERN=\"/etc/cryptsetup-keys.d/*\"" >>/etc/cryptsetup-initramfs/conf-hook

You now rebuild the initial ramdisk and you should now be able to boot the cryptosystem using either the high entropy password or your rescue one and it should only prompt in grub and shouldn’t prompt again. This image file is now ready to be used for confidential computing.

efitools for arm released

Thanks to using architecture emulation containers, I can now build and test efitools for arm (both 64 bit: aarch64 and 32 bit armv7l) on my x86_64 laptop.  To test the actual EFI binaries, you need the arm equivalent of OVMF which is ArmVirt.  I’ve got the build service spitting out the images (still called OVMF because of package naming requirements) for those who want to try this at home.  To run the UEFI image, you need qemu-system-<arch> which, on SUSE, is provided by the qemu-arm package.  The Linaro ARM64 site has detailed instructions, but for the impatient, the required command line (for aarch64) is

qemu-system-aarch64 \
        -m 2048 \
        -cpu cortex-a57 \
        -M virt \
        -bios /home/jejb/ovmf-aarch64.bin \
        -serial stdio \
        -drive if=none,file=fat:.,id=hd0 \
        -device virtio-blk-device,drive=hd0

Which brings up the UEFI bios image and mounts the current directory as an additional fat volume.  The Linaro site recommends 1024 for the memory size, but if you want to boot a kernel or do anything fancy, you’ll find 2048 is the minimum.  The same thing will work for 32 bit arm (except that you want to remove the -cpu line).

To get efitools to build, I had to get the arm versions of sbsigntools up and running, which I have.  The source tree I used for this is here.  There were a few modifications necessary to build arm, but nothing drastic.  Since the Ubuntu version of the git tree appears to be dead, I’ve merged back all the ubuntu patches and will maintain this going forwards.  I’ll bump the release version to 0.8 when I’m sure everything is stable.

Status

Using this, the current status is that the aarch64 binaries are all running fine and have verified nicely.  Unfortunately, there’s still some problem with generating efi binaries for the 32 bit arm systems.  Most of the simple binaries are running OK, but the more complex ones (like KeyTool) are showing symptoms of data relocation problems which I’m still investigating.  I’ll push a new release of efitools when I can get this sorted out, although it looks like a more generic problem inside gnu-efi itself.  Edk2 builds of am32 efi binaries seem to be working, so I’ll see if I can deduce what’s missing from gnu-efi.

Anatomy of the UEFI Boot Sequence on the Intel Galileo

The Basics

UEFI boot officially has three phases (SEC, PEI and DXE).  However, the DXE phase is divided into DXEBoot and DXERuntime (the former is eliminated after the call to ExitBootSerivices()).  The jobs of each phase are

  1. SEC (SECurity phase). This contains all the CPU initialisation code from the cold boot entry point on.  It’s job is to set the system up far enough to find, validate, install and run the PEI.
  2. PEI (Pre-Efi Initialization phase).  This configures the entire platform and then loads and boots the DXE.
  3. DXE (Driver eXecution Environment).  This is where the UEFI system loads drivers for configured devices, if necessary; mounts drives and finds and executes the boot code.  After control is transferred to the boot OS, the DXERuntime stays resident to handle any OS to UEFI calls.

How it works on Quark

This all sounds very simple (and very like the way an OS like Linux boots up).  However, there’s a very crucial difference: The platform really is completely unconfigured when SEC begins.  In particular it won’t have any main memory, so you begin in a read only environment until you can configure some memory.  C code can’t begin executing until you’ve at least found enough writable memory for a stack, so the SEC begins in hand crafted assembly until it can set up a stack.

On all x86 processors (including the Quark), power on begins execution in 16 bit mode at the ResetVector (0xfffffff0). As a helping hand, the default power on bus routing has the top 128KB of memory mapped into the top of SPI flash (read only, of course) via a PCI routing in the Legacy Bridge, meaning that the reset vector executes directly from the SPI Flash (this is actually very slow: SPI means Serial Peripheral Interface, so every byte of SPI flash has to be read serially into the instruction cache before it can be executed).

The hand crafted assembly clears the cache, transitions to Flat32 bit execution mode and sets up the necessary x86 descriptor tables.  It turns out that memory configuration on the Quark SoC is fairly involved and complex so, in order to spare the programmer from having to do this all in assembly, there’s a small (512kB) static ram chip that can be easily configured, so the last assembly job of the SEC is to configure the eSRAM (to a fixed address at 2GB), set the top as the stack, load the PEI into the base (by reconfiguring the SPI flash mapping to map the entire 8MB flash to the top of memory and then copying the firmware volume containing the PEI) and begin executing.

QuarkPlatform Build Oddities

Usually the PEI code is located by the standard Flash Volume code of UEFI and the build time PCDs (Platform Configuration Database entries) which use the values in the Flash Definition File to build the firmware.  However, the current Quark Platform package has a different style because it rips apart and rebuilds the flash volumes, so instead of using PCDs, it uses something it calls Master Flash Headers (MFHs) which are home grown for Quark.  These are a fixed area of the flash that can be read as a database giving the new volume layout (essentially duplicating what the PCDs would normally have done).  Additionally the Quark adds a non-standard signature header occupying 1k to each flash volume which serves two purposes: For the SECURE_LD case, it actually validates the volume, but for the three items in the firmware that don’t have flash headers (the kernel, the initrd and the grub config) it serves to give the lengths of each.

Laying out Flash Rom

This is a really big deal for most embedded systems because the amount of flash available is really limited.  The Galileo board is nice because it supplies 8MB of flash … which is huge in embedded terms.  All flash is divided into Flash Volumes7.  If you look at OVMF for instance, it builds its flash as four volumes: Three for the three SEC, PEI and DXE phases and one for the EFI variables.  In EdkII, flash files are built by the flash definition file (the one with a .fdf ending).  Usually some part of the flash is compressed and has to be inflated into memory (in OVMF this is PEI and DXE) and some are designed to be execute in place (usually SEC).  If you look at the Galileo layout, you see that it has a big SEC phase section (called BOOTROM_OVERRIDE) designed for the top 128kb of the flash , the usual variable area and then five additional sections, two for PEI and DXE and three recovery ones. (and, of course, an additional payload section for the OS that boots from flash).

Embedded Recovery Sections

For embedded devices (and even normal computers) recovery in the face of flash failure (whether from component issues or misupdate of the flash) is really important, so the Galileo follows a two stage fallback process.  The first stage is to detect a critical error signalled by the platform sticky bit, or recovery strap in the SEC and boot up to the fixed phase recovery which tries to locate a recovery capsule on the USB media8. The other recovery is a simple copy of the PEI image for fallback in case the primary PEI image fails (by now you’ll have realised there are three separate but pretty much identical copies of PEI in the flash rom).  One of the first fixes that can be made to the Quark build is to consolidate all of these into a single build description.

Putting it all together: Implementing a compressed PEI Phase

One of the first things I discovered when trying to update the UEFI version to something more modern is that the size of the PEI phase overflows the allowed size of the firmware volume.  This means either redo the flash layout or compress the PEI image.  I chose the latter and this is the story of how it went.

The first problem is that debug prints don’t work in the SEC phase, which is where the changes are going to have to be.  This is a big stumbling block because without debugging, you never know where anything went wrong.  It turns out that UEFI nicely supports this via a special DebugLib that outputs to the serial console, but that the Galileo firmware build has this disabled by this line:

[LibraryClasses.IA32.SEC]
...
 DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf

The BaseDebugLibNull does pretty much what you expect: throws away all Debug messages.  When this is changed to something that outputs messages, the size of the PEI image explodes again, mainly because Stage1 has all the SEC phase code in it.  The fix here is only to enable debugging inside the QuarkResetVector SEC phase code.  You can do this in the .dsc file with

 QuarkPlatformPkg/Cpu/Sec/ResetVector/QuarkResetVector.inf {
   <LibraryClasses>
     DebugLib|MdePkg/Library/BaseDebugLibSerialPort/BaseDebugLibSerialPort.inf
 }

And now debugging works in the SEC phase!

It turns out that a compressed PEI is possible but somewhat more involved than I imagined so that will be the subject of the next blog post.  For now I’ll round out with other oddities I discovered along the way

Quark Platform SEC and PEI Oddities

On the current quark build, the SEC phase is designed to be installed into the bootrom from 0xfffe 0000 to 0xffff ffff.  This contains a special copy of the reset vector (In theory it contains the PEI key validation for SECURE_LD, but in practise the verifiers are hard coded to return success).  The next oddity is that the stage1 image, which should really just be the PEI core actually contains another boot from scratch SEC phase, except this one is based on the standard IA32 reset vector code plus a magic QuarkSecLib and then the PEI code.  This causes the stage1 bring up to be different as well, because usually, the SEC code locates the PEI core in stage1 and loads, relocates and executes it starting from the entry point PeiCore().  However, quark doesn’t do this at all.  It relies on the Firmware Volume generator populating the first ZeroVector (an area occupying the first 16 bytes of the Firmware Volume Header) with a magic entry (located in the ResetVector via the magic string ‘SPI Entry Point ‘ with the trailing space).  The SEC code indirects through the ZeroVector to this code and effectively re-initialises the stack and begins executing the new SEC code, which then locates the internal copy of the PEI core and jumps to it.

Secure Boot on the Intel Galileo

The first thing to do is set up a build environment.  The Board support package that Intel supplies comes with a vast set of instructions and a three stage build process that uses the standard edk2 build to create firmware volumes, rips them apart again then re-lays them out using spi-flashtools to include the Arduino payload (grub, the linux kernel, initrd and grub configuration file), adds signed headers before creating a firmware flash file with no platform data and then as a final stage adds the platform data.  This is amazingly painful once you’ve done it a few times, so I wrote my own build script with just the essentials for building a debug firmware for my board (it also turns out there’s a lot of irrelevant stuff in quarkbuild.sh which I dumped).  I’m also a git person, not really an svn one, so I redid the Quark_EDKII tree as a git tree with full edk2 and FatPkg as git submodules and a single build script (build.sh) which pulls in all the necessary components and delivers a flashable firmware file.  I’ve linked the submodules to the standard tianocore github ones.  However, please note that the edk2 git pack is pretty huge and it’s in the nature of submodules to be volatile (you end up blowing them away and re-pulling the entire repo with git submodule update a lot, especially because the build script ends up patching the submodules) so you’ll want to clone your own edk2 tree and then add the submodule as a reference.  To do this, execute

git submodule init
git submodule update --reference <wherever you put your local edk2 tree> .module/edk2

before running build.sh; it will save a lot of re-cloning of the edk2 tree. You can also do this for FatPkg, but it’s tiny by comparison and not updated by the build scripts.

A note on the tree format: the Intel supplied Quark_EDKII is a bit of a mess: one of the requirements for the edk2 build system is that some files have to have dos line endings and some have to have unix.  Someone edited the Quark_EDKII less than carefully and the result is a lot of files where emacs shows splatterings of ^M, which annoys me incredibly so I’ve sanitised the files to be all dos or all unix line endings before importing.

A final note on the payload:  for the purposes of UEFI testing, it doesn’t matter and it’s another set of code to download if you want the Arduino payload, so the layout in my build just adds the EFI Shell as the payload for easy building.  The Arduino sections are commented out in the layout file, so you can build your own Arduino payload if you want (as long as you have the necessary binaries).  Also, I’m building for Galileo Kips Bay D … if you have a gen 2, you need to fix the platform-data.ini file.

An Aside about vga over serial consoles

If, like me, you’re interested in the secure boot aspect, you’re going to be spending a lot of time interacting with the VGA console over the serial line and you want it to look the part.  The default VGA PC console is very much stuck in an 80s timewarp.  As a result, the world has moved on and it hasn’t leading to console menus that look like this

vgaThis image is directly from the Intel build docs and actually, this is Windows, which is also behind the times9; if you fire this up from an xterm, the menu looks even worse.  To get VGA looking nice from an xterm, first you need to use a vga font (these have the line drawing characters in the upper 128 bytes), then you need to turn off UTF-8 (otherwise some of the upper 128 character get seen as UTF-8 encodings), turn off the C1 control characters and set a keyboard mapping that sends what UEFI is expecting as F7.  This is what I end up with

xterm -fn vga -geometry 80x25 +u8 -kt sco -k8 -xrm "*backarrowKeyIsErase: false"

And don’t expect the -fn vga to work out of the box on your distro either … vga fonts went out with the ark, so you’ll probably have to install it manually.  With all of this done, the result is

vga-xtermAnd all the function keys work.

Back to our regularly scheduled programming: Secure Boot

The Quark build environment comes with a SECURE_LD mode, which, at first sight, does look like secure boot.  However, what it builds is a cryptographic verification system for the firmware payloads (and disables the shell for good measure).  There is also an undocumented SECURE_BOOT build define instead; unfortunately this doesn’t even build (it craps out trying to add a firmware menu: the Quark is embedded and embedded don’t have firmware menus). Once this is fixed, the image will build but it won’t boot properly.  The first thing to discover is that ASSERT() fails silently on this platform.  Why? Well if you turn on noisy assert, which prints the file and line of the failure, you’ll find that the addition of all these strings make the firmware volumes too big to fit into their assigned spaces … Bummer.  However printing nothing is incredibly useless for debugging, so just add a simple print to let you know when ASSERT() failure is hit.  It turns out there are a bunch of wrong asserts, including one put in deliberately by intel to prevent secure boot from working because they haven’t tested it.  Once you get rid of these,  it boots and mostly works.

Mostly because there’s one problem: authenticating by enrolled binary hashes mostly fails.  Why is this?  Well, it turns out to be a weakness in the Authenticode spec which UEFI secure boot relies on.  The spec is unclear on padding and alignment requirements (among a lot of other things; after all, why write a clear spec when there’s only going to be one implementation … ?).  In the old days, when we used crcs, padding didn’t really matter because additional zeros didn’t affect the checksum but in these days of cryptographic hashes, it does matter.  The problem is that the signature block appended to the EFI binary has an eight byte alignment.  If you add zeros at the end to achieve this, those zeros become part of the hash.  That means you have to hash IA32 binaries as if those padded zeros existed otherwise the hash is different for signed and unsigned binaries. X86-64 binaries seem to be mostly aligned, so this isn’t noticeable for them.  This problem is currently inherent in edk2, so it needs patching manually in the edk2 tree to get this working.

With this commit, the Quark build system finally builds a working secure boot image.  All you need to do is download the ia32 efitools (version 1.5.2 or newer — turns out I also didn’t compute hashes correctly) to an sd card and you’re ready to play.

Adventures in Embedded UEFI with Intel Galileo

At one of the Intel Technology Days conferences a while ago, Intel gave us a gift of a Galileo board, which is based on the Quark SoC, just before the general announcement.  The promise of the Quark SoC was that it would be a fully open (down to the firmware) embedded system based on UEFI.  When the board first came out, though, the UEFI code was missing (promised for later), so I put it on a shelf and forgot about it.   Recently, the UEFI Security Subteam has been considering issues that impinge on embedded architectures (mostly arm) so having an actual working embedded development board could prove useful.  This is the first part of the story of trying to turn the Galileo into an embedded reference platform for UEFI.

The first problem with getting the Galileo working is that if you want to talk to the UEFI part it’s done over a serial interface, with a 3.5″ jack connection.  However, a quick trip to amazon solved that one.  Equipped with the serial interface, it’s now possible to start running UEFI binaries.   Just using the default firmware (with no secure boot) I began testing the efitools binaries.  Unfortunately, they were building the size of the secure variables (my startup.nsh script does an append write to db) and eventually the thing hit an assert failure on entering the UEFI handoff.  This led to the discovery that the recovery straps on the board didn’t work, there’s no way to clear the variable NVRAM and the only way to get control back was to use an external firmware flash tool.  So it’s off for an unexpected trip to uncharted territory unless I want the board to stay bricked.

The flash tool Intel recommends is the Dediprog SF 100.  These are a bit expensive (around US$350) and there’s no US supplier, meaning you have to order from abroad, wait ages for it to be delivered and deal with US customs on import, something I’ve done before and have no wish to repeat.  So, casting about for a better solution, I came up with the Bus Pirate.  It’s a fully open hardware component (including a google code repository for the firmware and the schematics) plus it’s only US$35 (that’s 10x cheaper than the dediprog), available from Amazon and works well with Linux (for full disclosure, Amazon was actually sold out when I bought mine, so I got it here instead).

The Bus Pirate comes as a bare circuit board (no case or cables), so you have to buy everything you might need to use it extra.  I knew I’d need a ribbon cable with SPI plugs (the Galileo has an SPI connector for the dediprog), so I ordered one with the card.  The first surprise when the card arrived was that the USB connector is actually Mini B not the now standard Micro connector.  I’ve not seen one of those since I had an Android G1, but, after looking in vain for my old android one, Staples still has the cables.    The next problem is that, being open hardware, there are multiple manufacturers.  They all supply a nice multi coloured ribbon cable, but there are two possible orientations and both are produced.  Turns out I have a sparkfun cable which is the opposite way around from the colour values in the firmware (which is why the first attempt to talk to the chip didn’t work).  The Galileo has diode isolators so the SPI flash chip can be powered up and operated independently by the Bus Pirate;  accounting for cable orientation, and disconnecting the Galileo from all other external power, this now works.  Finally, there’s a nice Linux project, flashrom, which allows you to flash all manner of chips and it has a programmer mode for the Bus Pirate.  Great, except that the default USB serial speed is 115200 and at that baud rate, it takes about ten minutes just to read an 8MB SPI flash (flashrom will read, program and then verify, giving you about 25 mins each time you redo the firmware).  Speeding this up is easy: there’s an unapplied patch to increase the baud rate to 2Mbit and I wrote some code to flash only designated areas of the chip (note to self: send this upstream).  The result is available on the OpenSUSE build service.  The outcome is that I’m now able to build and reprogram the firmware in around a minute.

By now this is two weeks, a couple of hacks to a tool I didn’t know I’d need and around US$60 after I began the project, but at least I’m now an embedded programmer and have the scars to prove it.  Next up is getting Secure Boot actually working ….

efitools now working for both x86 and x86_64

Some people noticed a while ago that version 1.5.1 was building for ia32, but 1.5.2 is the release that’s tested and works (1.5.1 had problems with the hash computations).  The primary reason for this work is bringing up secure boot on the Intel Quark SoC board (I’ve got the Galileo Kipps Bay D but the link is to the new Gen 2 board).  There’s a corresponding release of OVMF, with a fix for a DxeImageVerification problem that meant IA32 hashes weren’t getting computed properly.   I’ll post more on the Galileo and what I’ve been doing with it in a different post (tagged for embedded, since it’s mostly a story of embedded board programming).

With these tools, you can now test out IA32 images for both UEFI and the key manipulation and signing tools.

UEFI Secure Boot Tools Updated for 2.4

UEFI 2.4 has been out for a while (18 months to be exact).  However, it’s taken me a while to redo the tianocore builds for the 2.4 base.  This is now done, so the OVMF packages are now building 2.4 roms

https://build.opensuse.org/package/show/home:jejb1:UEFI/OVMF

Please remember that this is bleeding edge.  I found a few bugs while testing the tools, so there are likely many more lurking in there.  I’ve also updated the tools package

https://build.opensuse.org/package/show/home:jejb1:UEFI/efitools

As of version 1.5.1, there’s a new generator for hashed revocation certificates and KeyTool now supports the timestamp signature database.  I’ve also published a version of sbsigntools

https://build.opensuse.org/package/show/home:jejb1:UEFI/sbsigntools

Which, as of 0.7, has support for multiple signatures.

Problems with TianoCore after multi-sign (r14141) Fixed

For technical reasons, all of my tools broke with all versions of TianoCore after r14141 (Update the DxeImageVerificationLib to support for Authenticode-signed UEFI images with multiple signatures.)  What actually happened is that the multi signature verification code got stricter on the alignment requirements for signatures.  The current sbsigntools (and even pesign) simply slapped the signature block immediately at the end of the binary.  Unfortunately this meant that most of the time it wasn’t actually aligned on a long word boundary meaning that most signatures with old versions of sbsign and pesign start giving security violations on new TianoCore platforms.  I’ve fixed sbsigntools to pad the end of the binary and ensure that the signature block always starts on a long word alignment and verified that the signatures now work again with the latest versions of TianoCore.

I’ve updated the OVMF, efitools and sbsigntools packages with fixes for this problem and they should now be propagating through the system.

efitools 1.4 with linux key manipulation utilities released

This is a short post to announce the release of efitools version 1.4. The packages are available here:

http://download.opensuse.org/repositories/home:/jejb1:/UEFI/

Just select your distribution (various versions of Debian, Ubuntu, Fedora and openSUSE).

The main additions over version 1.3 is that there are two new utilities: efi-readvar and efi-updatevar that allow you to read and manipulate the UEFI signatures database from linux userspace.  The disadvantage of these tools is that they require the new efivarfs filesystem to be mounted somewhere in the system (efivarfs was added in kernel 3.8, so you either have to be running a very recent distribution, like openSUSE Tumbleweed, or you have to build your own kernels to use it).    To mount the efivarfs filesystem, just do

modprobe efivars

mount -o efivarfs none /mnt (or some other useful mount point)

As long as this is successful, you can now use the efi- tools to read and write the secure variables.  In order to write the variables in User Mode, you must have the private part of the Platform and Key Exchange Keys available.  So, assuming your private part of the platform key is PK.key, you can add your own KEK with

efi-updatevar -a -c KEK.crt -k PK.key KEK

And then add your own signature database key with

efi-updatevar -a -c DB.crt -k KEK.key db

Where KEK.crt and DB.crt are X509 certificates.

Note that by default, files in the efivarfs filesystem are owned by root and have permission 0644, so efi-readvar can be executed by any user, but efi-updatevar needs to be able to write to the variable files (i.e. would have to run as root).

There’s also extensive man pages documenting all the options for both efi-readvar and efi-updatevar.