In the previous post I looked at how you build an encrypted image that can maintain its confidentiality inside AMD SEV or Intel TDX. In this post I’ll discuss how you actually bring up a confidential VM from an encrypted image while preserving secrecy. However, first a warning: This post represents the state of the art and includes patches that are certainly not deployed in distributions and may not even be upstream, so if you want to follow along at home you’ll need to patch things like qemu, grub and OVMF. I should also add that, although I’m trying to make everything generic to confidential environments, this post is based on AMD SEV, which is the only confidential encrypted1 environment currently shipping.
The Basics of a Confidential Computing VM
At its base, current confidential computing environments are about using encrypted memory to run the virtual machine and guarding the encryption key so that the owner of the host system (the cloud service provider) can’t get access to it. Both SEV and TDX have the encryption technology inside the main memory controller meaning the L1 cache isn’t encrypted (still vulnerable to cache side channels) and DMA to devices must also be done via unencryped memory. This latter also means that both the BIOS and the Operating System of the guest VM must be enlightened to understand which pages to encrypted and which must not. For this reason, all confidential VM systems use OVMF2 to boot because this contains the necessary enlightening. To a guest, the VM encryption looks identical to full memory encryption on a physical system, so as long as you have a kernel which supports Intel or AMD full memory encryption, it should boot.
Each confidential computing system has a security element which sits between the encrypted VM and the host. In SEV this is an aarch64 processor called the Platform Security Processor (PSP) and in TDX it is an SGX enclave running Intel proprietary code. The job of the PSP is to bootstrap the VM, including encrypting the initial OVMF and inserting the encrypted pages. The security element also includes a validation certificate, which incorporates a Diffie-Hellman (DH) key. Once the guest owner obtains and validates the DH key it can use it to construct a one time ECDH encrypted bundle that can be passed to the security element on bring up. This bundle includes an encryption key which can be used to encrypt secrets for the security element and a validation key which can be used to verify measurements from the security element.
The way QEMU boots a Q35 machine is to set up all the configuration (including a disk device attached to the VM Image) load up the OVMF into rom memory and start the system running. OVMF pulls in the QEMU configuration and constructs the necessary ACPI configuration tables before executing grub and the kernel from the attached storage device. In a confidential VM, the first task is to establish a Guest Owner (the person whose encrypted VM it is) which is usually different from the Host Owner (the person running or controlling the Physical System). Ownership is established by transferring an encrypted bundle to the Secure Element before the VM is constructed.
The next step is for the VMM (QEMU in this case) to ask the secure element to provision the OVMF Firmware. Since the initial OVMF is untrusted, the Guest Owner should ask the Secure Element for an attestation of the memory contents before the VM is started. Since all paths lead through the Host Owner, who is also untrusted, the attestation contains a random nonce to prevent replay and is HMAC’d with a Guest Supplied key from the Launch Bundle. Once the Guest Owner is happy with the VM state, it supplies the Wrapped Key to the secure element (along with the nonce to prevent replay) and the Secure Element unwraps the key and provisions it to the VM where the Guest OS can use it for disc encryption. Finally, the enlightened guest reads the encrypted disk to unencrypted memory using DMA but uses the disk encryptor to decrypt it to encrypted memory, so the contents of the Encrypted VM Image are never visible to the Host Owner.
The Gaps in the System
The most obvious gap is that EFI booting systems don’t go straight from the OVMF firmware to the OS, they have to go via an EFI bootloader (grub, usually) which must be an efi binary on an unencrypted vFAT partition. The second gap is that grub must be modified to pick the disk encryption key out of wherever the Secure Element has stashed it. The third is that the key is currently stashed in VM memory before OVMF starts, so OVMF must know not to use or corrupt the memory. A fourth problem is that the current recommended way of booting OVMF has a flash drive for persistent variable storage which is under the control of the host owner and which isn’t part of the initial measurement.
Plugging The Gaps: OVMF
To deal with the problems in reverse order: the variable issue can be solved simply by not having a persistent variable store, since any mutable configuration information could be used to subvert the boot and leak the secret. This is achieved by stripping all the mutable variable handling out of OVMF. Solving key stashing simply means getting OVMF to set aside a page for a secret area and having QEMU recognise where it is for the secret injection. It turns out AMD were already working on a QEMU configuration table at a known location by the Reset Vector in OVMF, so the secret area is added as one of these entries. Once this is done, QEMU can retrieve the injection location from the OVMF binary so it doesn’t have to be specified in the QEMU Machine Protocol (QMP) command. Finally OVMF can protect the secret and package it up as an EFI configuration table for later collection by the bootloader.
The final OVMF change (which is in the same patch set) is to pull grub inside a Firmware Volume and execute it directly. This certainly isn’t the only possible solution to the problem (adding secure boot or an encrypted filesystem were other possibilities) but it is the simplest solution that gives a verifiable component that can be invariant across arbitrary encrypted boots (so the same OVMF can be used to execute any encrypted VM securely). This latter is important because traditionally OVMF is supplied by the host owner rather than being part of the VM image supplied by the guest owner. The grub script that runs from the combined volume must still be trusted to either decrypt the root or reboot to avoid leaking the key. Although the host owner still supplies the combined OVMF, the measurement assures the guest owner of its correctness, which is why having a fairly invariant component is a good idea … so the guest owner doesn’t have potentially thousands of different measurements for approved firmware.
Plugging the Gaps: QEMU
The modifications to QEMU are fairly simple, it just needs to scan the OVMF file to determine the location for the injected secret and inject it correctly using a QMP command.. Since secret injection is already upstream, this is a simple find and make the location optional patch set.
Plugging the Gaps: Grub
Grub today only allows for the manual input of the cryptodisk password. However, in the cloud we can’t do it this way because there’s no guarantee of a secure tty channel to the VM. The solution, therefore, is to modify grub so that the cryptodisk can use secrets from a provider, in addition to the manual input. We then add a provider that can read the efi configuration tables and extract the secret table if it exists. The current incarnation of the proposed patch set is here and it allows cryptodisk to extract a secret from an efisecret provider. Note this isn’t quite the same as the form expected by the upstream OVMF patch in its grub.cfg because now the provider has to be named on the cryptodisk command line thus
cryptodisk -s efisecret
but in all other aspects, Grub/grub.cfg works. I also discovered several other deviations from the initial grub.cfg (like Fedora uses /boot/grub2 instead of /boot/grub like everyone else) so the current incarnation of grub.cfg is here. I’ll update it as it changes.
Putting it All Together
Once you have applied all the above patches and built your version of OVMF with grub inside, you’re ready to do a confidential computing encrypted boot. However, you still need to verify the measurement and inject the encrypted secret. As I said before, this isn’t easy because, due to replay defeat requirements, the secret bundle must be constructed on the fly for each VM boot. From this point on I’m going to be using only AMD SEV as the example because the Intel hardware doesn’t yet exist and AMD kindly gave IBM research a box to play with (Anyone with a new EPYC 7xx1 or 7xx2 based workstation can likely play along at home, but check here). The first thing you need to do is construct a launch bundle. AMD has a tool called sev-tool to do this for you and the first thing you need to do is obtain the platform Diffie Hellman certificate (pdh.cert). The tool will extract this for you
Or it can be given to you by the cloud service provider (in this latter case you’ll want to verify the provenance using sevtool –validate_cert_chain, which contacts the AMD site to verify all the details). Once you have a trusted pdh.cert, you can use this to generate your own guest owner DH cert (godh.cert) which should be used only one time to give a semblance of ECDHE. godh.cert is used with pdh.cert to derive an encryption key for the launch bundle. You can generate this with
sevtool --generate_launch_blob <policy>
The gory details of policy are in the SEV manual chapter 3, but most guests use 1 which means no debugging. This command will generate the godh.cert, the launch_blob.bin and a tmp_tk.bin file which you must save and keep secure because it contains the Transport Encryption and Integrity Keys (TEK and TIK) which will be used to encrypt the secret. Figuring out the qemu command line options needed to launch and pause a SEV guest is a bit of a palaver, so here is mine. You’ll likely need to change things, like the QMP port and the location of your OVMF build and the launch secret.
Finally you need to get the launch measure from QMP, verify it against the sha256sum of OVMF.fd and create the secret bundle with the correct GUID headers. Since this is really fiddly to do with sevtool, I wrote this python script3 to do it all (note it requires qmp.py from the qemu git repository). You execute it as
sevsecret.py --passwd <disk passwd> --tiktek-file <location of tmp_tk.bin> --ovmf-hash <hash> --socket <qmp socket>
And it will verify the launch measure and encrypt the secret for the VM if the measure is correct and start the VM. If you got everything correct the VM will simply boot up without asking for a password (if you inject the wrong secret, it will still ask). And there you have it: you’ve booted up a confidential VM from an encrypted image file. If you’re like me, you’ll also want to fire up gdb on the qemu process just to show that the entire memory of the VM is encrypted …
Conclusions and Caveats
The above script should allow you to boot an encrypted VM anywhere: locally or in the cloud, provided you can access the QMP port (most clouds use libvirt which introduces yet another additional layering pain). The biggest drawback, if you refer to the diagram, is the yellow box: you must trust the secret element, which in both Intel and AMD is proprietary4, in order to get confidential computing to work. Although there is hope that in future the secret element could be fully open source, it isn’t today.
The next annoyance is that launching a confidential VM is high touch requiring collaboration from both the guest owner and the host owner (due to the anti-replay nonce). For a single launch, this is a minor annoyance but for an autoscaling (launch VMs as needed) platform it becomes a major headache. The solution seems to be to have some Hardware Security Module (HSM), like the cloud uses today to store encryption keys securely, and have it understand how to measure and launch encrypted VMs on behalf of the guest owner.
The final conclusion to remember is that confidentiality is not security: your VM is as exploitable inside a confidential encrypted VM as it was outside. In many ways confidentiality and security are opposites, in that security in part requires reducing the trusted code and confidentiality requires pulling as much as possible inside. Confidential VMs do have an answer to the Cloud trust problem since the enterprise can now deploy VMs without fear of tampering by the cloud provider, but those VMs are as insecure in the cloud as they were in the Enterprise Data Centre. All of this argues that Confidential Computing, while an important milestone, is only one step on the journey to cloud security.
The OVMF patches are upstream (including modifications requested by Intel for TDX). The QEMU and grub patch sets are still on the lists.