Category Archives: Linux

General Linux and Open Source related blog posts

Owning Your Own Copyrights in Open Source

This article covers several aspects: owning the copyrights you develop outside of your employed time and the more thorny aspect of owning the copyrights in open source projects you work on for your employer. It will also take a look at the middle ground of being a contract entity doing paid work on open source. This article follows the historical sweep of my journey through this field and so some aspects may be outdated and all are within the bounds of the US legal system and it’s most certainly not complete, just a description of what I did and what I learned.

Why Should you Own your Own Source code?

In the early days of open source, everything was a hobby project and everyone owned their own contributions. Owning your own contribution was a sort of mark of franchise in the project. Of course, there were some projects, notably the FSF ones, which didn’t believe in distributed ownership and insisted you contribute ownership of your copyrights to them so they could look after the project for you. Obviously, since I’m a Linux Kernel developer and with the Linux Kernel being a huge distributed copyright project, it’s easy to see which side of the argument I fall.

The main rights you give up if you don’t own the code you create are the right to re-licence and the right to enforce. It probably hadn’t occurred to you that if you actually find a licence violation in a project you contribute to for your employer, you’ll have no standing to demand that the problem get addressed. In fact, any enforcement on the code would have to be done by the proper owner: your employer. Plus your employer can control the ultimate destination of that ownership, including selling your code to a copyright troll if they so wished … while you may trust your employer now you work for them, do you trust them to do the right thing for all time, especially since they may be bought out by EvilCorp on down the road?

The relicensing problem can also be thorny: as a strong open source contributor you’ve likely been on the receiving end of requests to relicense (“I really like the code in your project X and would like to incorporate it in my open source project Y, but there’s a licence compatibility problem, would you dual license it?”) and thought nothing about saying “yes”. However, if your employer owns the code, you were likely lying when you said “yes” because you have no relicensing rights and you must ask your employer for permission to do the relicensing.

All the above points up the dangers in the current ecosystem. Project contributors often behave like they own the code but if they don’t they can be leaving a legal minefield in their wakes. The way to fix this is to own your own code … or at least understand the limitations of your rights if you don’t.

Open Source in Your Own Time

It’s a mistake to think that just because you work on something in your own time it isn’t actually owned by your employer. Historically, at least in the US, employment agreements contain incredibly broad provisions for invention ownership which basically try to claim anything you invent at any hour of the day or night that might be even vaguely related to your employment. Not unnaturally this caused huge volumes of litigation around startups where former employees successfully develop innovations their prior employer declined to pursue (at least until it started making money). This has lead to a slew of state based legal safe harbour protections for employee inventions. Most of them, like the Illinois Statute I first used, have similar wording

A provision in an employment agreement which provides that an employee shall assign or offer to assign any of the employee’s rights in an invention to the employer does not apply to an invention for which no equipment, supplies, facilities, or trade secret information of the employer was used and which was developed entirely on the employee’s own time … is … void and unenforceable.

765 ILCS 1060/2

In fact most states now require the wording to appear in the employment contract, so you likely don’t have to look up the statute to figure out what to do. The biggest requirements are that it be on your own time and you not be using any employer equipment, so the most important thing is to make sure you have your own laptop or computer. If you follow the requirements to the letter, you should be safe enough in owning your own time open source code. However, if you really want a guarantee you need to take extra precautions.

Own Time Open Source Carve Outs in Employment agreements

When you join a company, one of the things you’ll sign is a prior invention disclosure form, usually as an appendix to the invention assignment agreement as part of your employment contract. Here’s an example one from the SEC database (ironically for a Chinese subsidiary). Look particularly at section 2(a) “Inventions Retained and Licensed”. It’s basically pure CYA for the company, and most people leave Exhibit A blank, but you shouldn’t do that. What you should do is list all your current and future (by doing sweeping guesswork) own open source projects. The most useful clause in 2(a) says “I agree that I will not incorporate any Prior Inventions into any products …” so you and your employer have now agreed that all the listed projects are outside the scope of your employment agreement.

As far as I can tell, no-one really looks at Exhibit A at all, so I’ve been really general and put things like “The Linux Kernel” and “Open Source UEFI software” “Open Source cryptography such as gnupg, openssl and gnutls” and never been challenged on it.

One legitimate question, which will probably happen if your carve outs are very broad, is what happens if your employer specifically asks you to work on a project you’ve declared in Exhibit A? Ideally you could use this as an opportunity to negotiate an addendum to your contract covering your ownership of open source. However, if you don’t want to rock the boat, you can simply do nothing and rely on the fact that the agreement has something to say about this. The sample section 2(a) above goes on to give your employer a non-exclusive licence, which you could take as agreement to your continued ownership of the copyrights in the code, even through your employer is now instructing you (and paying you) to work on it. However, the say nothing approach has never been tested in court and may be vulnerable to challenge, so a safer course is to send your manager an email pointing out the issue and proposing to follow the licence in the employment contract. If they do nothing, thinking the matter settled, as most managers do, then you have legal cover for continuing to own your own copyrights. You can make it as vague as you like, so using the above sample agreement, something like “You’ve asked me to work on Project X which was listed in Exhibit A of my employment agreement. To move forward, I’m happy to licence all future works on this project to you under the terms of section 2(a)”. It looks innocuous, but it’s actually a statement that your company doesn’t get copyright ownership because of the actual wording in section 2(a) says the company gets a non-exclusive licence if you incorporate any works listed in Exhibit A. Remember to save the email somewhere safe (and any reply which is additional proof it was seen) just in case.

Owning Open Source Produced on Company Time

The first thing to note is that if your employer pays for you to work on open source, absent any side agreement, the code that you produce will be owned by your employer. This isn’t some US specific thing, this is a general principle of employment the world over (they pay you, so they own it). So even if you work in Europe, your employer will still own your open source copyrights if they pay you to work on the project, moral rights arguments notwithstanding. The only way to change this is to get some sort of explicit or implicit (if you want to go the carve out route above) agreement about the ownership.

Although I’ve negotiated both joint and exclusive ownership of open source via employment agreements, the actual agreements are still the property of the relevant corporations and thus, unfortunately, while I can describe some of the elements, I can’t publish the text (employment agreements are the crown jewels the HR dragons guard).

How to Negotiate

Most employers (or at least their lawyers) will refuse point blank to change the wording of employment agreements. However, what you want can be a side agreement and usually doesn’t require rewording the employment agreement at all. All you need is the understanding that the side agreement will get executed. One big problem can be that most negotiations over employment agreements occur with people from HR, which is a department with the least understanding of open source, so you don’t want to be negotiating the side agreement with them, you want to talk to the person that is hiring you. You also need to present your request as reasonable, so find out if anyone inside your prospective new company has done something similar. Often they have, and they’ll likely be someone in open source you’ve at least heard of so you can approach them and ask for details. “But you gave a copyright ownership side agreement to X” is often a great way to advance your cause. Don’t be afraid to ask and argue politely but firmly … hiring talented developers is very competitive nowadays so they have (or at least the manager who wants to hire you has) a vested interest in keeping you happy.

Consider Joint Ownership

Joint ownership is a specific legal term meaning the rights in a copyright are shared by the joint owners. Effectively this sharing means that either party may enforce without consulting the other and either party may license the work without consulting the other (but here they must share any profits from the licence equally among joint owners).

Joint ownership is often a good solution because it gives you the right to relicence and the right to enforce, while also giving your employer a share in what they paid to produce. Joint ownership is often far easier to sell to corporations than one or other of you having exclusive ownership because it gives them all the rights they would have had anyway. The only slight concern you may have down the road is it does give them the right to relicence or sell on their ownership, say to an open core business or to an enforcement troll. However, the good news is that as joint owner you now have a right to a half share of any profit they (and the new owner) make out of such a rights transfer, which can potentially act as a deterrent to the transaction if you remind them of this requirement.

Open Source as a Contractor

In some ways this is the best relationship. There are no work for hire assumptions about companies you contract for owning your free time, so doing other open source projects is easy. However, a contractor is bound by whatever contract you sign, so you need someone with legal training to help you make sure it is actually equitable. You can’t get around this legal requirement: the protections that exist for employees don’t exist for contractors, so if you sign a contract saying in exchange for a certain sum company X owns the entirety of your output, you will be bound by it. So remember: read the contract and negotiate the terms.

Copyright Ownership as a Contractor

Surprisingly, in a relationship where you’re contracted to get something upstream, it’s often in the client’s best interest to have the contractor own the copyrights in Open Source. It means the contractor is responsible for all the nitty gritty of pushing patches and dealing with contribution agreements and the client simply gets the end product: the thing they wanted upstream. I’ve found this a surprisingly easy sell to most legal departments. Even if the client does want some sort of ownership of the code, you can offer joint ownership as the easy route to you taking on all the hassle and them getting the benefits of ownership.

Trade Secrets

As a contractor, you’ll likely be forced to sign an NDA never to reveal client secrets. This is pretty usual, but the pitfall in open source, particularly if you’re doing a driver for a device whose programming manual is under NDA, is that you are going to be revealing them contrary to the NDA. You need this handled in an equitable fashion in the contract to avoid unpleasant problems long after the job is done. The simplest phrase you need is something like “Client understands that open source is developed in public and authorizes that all information necessary to producing X under this contract be disclosed to the public”.

Patents

Patents can be a huge minefield with contract open source, because as a contractor who owns the copyrights and negotiates the contribution agreements, you have no authority to bind your client’s patents. You really don’t want to find yourself being used as a conduit for a patent ambush on open source (where a client contracts with you to put code into a project which reads on a patent they hold and then turns around and patent trolls the ecosystem) so you need contract language binding the client patents at least in the work you’re doing for them. Something simple like “Client grants a perpetual and irrevocable licence, consistent with the terms of the open source licence for X, to all contributions made by contractor to X that read on patents client holds now or may in future acquire”. This latter is pretty narrow, so you could start out by trying to get a patent licence for the entirety of project X and negotiate down from there.

Conclusions

Owning your own copyrights in open source is possible provided you’re careful. The strategies outlined above are based on my own experiences (all in the US) as a contract employee from 1995-2008 there after as a regular employee but are not the only ones you could pursue, so ask around to see what others have done as well. The main problem with all the strategies above is that they work well when you’re negotiating your employment. If you’re already working at some corporation they’re unlikely to be helpful to you unless you really have a simple own time open source project. Oh, and just remember that while the snippets I quoted above for the contract case may actually have been in contracts I signed, this isn’t legal advice and you should have a lawyer advise you how best to incorporate the various points raised.

Papering Over our TPM 2.0 TSS Divisions

For years I’ve been hoping that the Trusted Computing Group (TCG) based IBM and Intel TSS (TCG Software Stack) would simply integrate with one another into a single package. The rationale is pretty simple: the Intel TSS is already quite a large collection of libraries so adding one more (the IBM TSS has a single library) wouldn’t be too much of a burden. Both TSSs are based on TCG specifications, except that the IBM TSS is based on the TPM 2.0 Library Specification and the Intel TSS is based on the TPM Software Stack (also, not at all confusingly, abbreviated TSS). There’s actually very little overlap between these specifications so co-existence seems very reasonable. Before we get into the stories of these two stacks and what they do, I should confess my biases: while I’ve worked with the TCG over the years, I’ve always harboured the view that the complete lack of adoption of TPM 2.0’s predecessor (TPM 1.2) was because of the hugely complicated nature of the TCG mandated software stack which was implemented in Linux by trousers. It is my firm belief that the complexity of the API lead to the lack of uptake, even though I made several efforts over the years to make use of it.

My primary interest in the TPM has been as a secure laptop keystore (since I already paid for a TPM, I didn’t see the need to fork out again for one of the new security dongles; plus the TPM is infinitely scalable in the number of keys, unlike most dongles). The key to making the TPM usable in this form is integration with existing Cryptographic systems (via plugins if they do them). Since openssl has an engine plugin, I’ve already produced an openssl TPM2 engine, patches for gnupg and engine integration patches for openvpn (upstream in 2.5) and openssh as well as a PKC11 exporter (to make file based engine keys exportable as PKCS11 tokens). Note a lot of the patches aren’t strictly TPM patches, they’re actually making openssl engines work in places they previously didn’t. However, the one thing most of the patches that actually touch the TPM have in common is that they have to pick one or other of the available TSSs to operate with. Before describing the TSS agnostic solution, lets look at why these two TSSs exist and what the difference is between them and why you might choose one over the other.

Schizophrenia at the TCG

As I said in the introduction, both TSSs are based on TCG specifications. These standards aren’t ambiguous: they lay out in excruciating detail what the header files are called and what the prototypes and structures have to be. Both TSS implementations are the way they are because they wouldn’t be following the standards if they deviated even slightly. The problem is the standards don’t agree with each other in meaningful ways. For instance the TPM Library standards define every structure in terms of the fundamental unit of TPM data: the TPM2B structure, which defines a 16 bit big endian length followed by a data unit of that length. The TPM Library standards (in Part 4 section 9.10.6) lay out that every TPM2B_X structure shall be a union of a ‘b’ element which is a TPM2B and a ‘t’ element which is the actual structure. However the TPM Software Stack specification eliminates the plain TPM2B so every TPM2B_X structure in the latter specification are not unions, they are simply the ‘t’ form of the structure. This means that although TPM2B_X structures in each specification are byte for byte the same, they are definitionally different when written as C code and can’t be assigned to each other … oops. The TPM Library standard lays out additional structures for an elaborate calling convention for the TPM2_Command interfaces which are completely different from the ESYS_Command interfaces in the TPM Software Stack.

The reason it’s all done this way? well the specifications were built by completely different committees for what the committees saw as separate use cases, so they didn’t see a need to reconcile the differences. As long as the definitions were byte for byte compatible, everything would work out correctly on the wire. The problem was the TPM Library specification was released nearly a decade ahead of the TPM Software Stack specification, so the first TSS created had to follow the former because the latter didn’t exist.

Sessions, HMAC and Encryption

One of the perennial problems of a TPM is that integrity and security of the information going over the wire is the responsibility of the user. However, the encryption and integrity computations involved, particularly the key derivations, are incredibly involved (even though well documented in the TPM Library specification, so naturally everyone would like the TSS to do this. The problem the TPM Secure Stack had is that all the way up to its ESAPI specification, the security and integrity computations were still the responsibility of the user, so it didn’t begin to be useful until ESAPI was finalized a couple of years ago.

The Resource Manager Problem

TPM 2.0 was designed to be far leaner in terms of resources than TPM 1.2, which meant there was a very small limit to the number of sessions and volatile objects it could contain at any one time. This necessitated the use of a “resource manager” to control access otherwise applications would get unexpected out of resource errors. The Intel TSS has its own resource manager. However, the Linux Kernel itself incorporated a resource manager in the TPM device in 4.12 and the IBM TSS avoids the need for its own resource manager by using this, and will, therefore not work correctly on earlier kernel versions.

Inside the IBM TSS

Even though the IBM TSS is based on a solid and easily comprehensible and detailed specification, that specification itself suffers from a couple of defects. The first being it assumes you’re submitting to a physical TPM, so the specification has no functional (library based) submission API for TPM commands, so the IBM TSS had to invent API it called TSS_Execute() which is a way of sending TPM commands directly to the physical TPM over the kernel’s device interfaces. Secondly, the standard contains no routing interfaces (telling it what destination the TPM is on: should it open the /dev/tpmrm0 device or send the commands to the TPM over an IP socket), so this is controlled in the IBM TSS by several environment variables (TPM_INTERFACE_TYPE, which can be either “dev” or “socsim” for either a physical device or a network socket. The endpoints being controlled by TPM_DEVICE for “dev” type, which specifies which device to use, defaulting to /dev/tpmrm0 or TPM_SERVER_NAME and TPM_PLAFORM_PORT for “socsim”).

The invented TSS_Execute() API also does all the encryption and HMAC parts necessary for secure and integrity verified communication with the TPM, so it acts as a fully functional TSS. The main drawback of the IBM TSS is that it stores essential information about the sessions and handles in files which will, by default, be dropped into the local directory. Most users of the IBM TSS have to set TPM_DATA_DIR to be a specially created directory under /tmp to avoid leaving messy artifacts in users home directories.

Inside the Intel TSS

The TPM Software Stack consists of a large number of different specifications, including the resource manager (which is now unnecessary for kernels above 4.12) the TCTI which specifies the routing information for the TPM. It turns out that even in the Intel TSS, environment variables are the most convenient form to specify this information but, unfortunately, the name of the environment variable has been left up to each use case instead of being standardised in the library meaning you’ll have to consult the man page to figure out what it is. The next set of standards: SAPI and ESAPI define functional interfaces to the TPM with one submission API for each command and additionally a corresponding ..._Async()/..._Finish() pair for asynchronous programming. The only real difference between SAPI and ESAPI is that the latter also does the necessary session cryptography for security and integrity, so it’s pretty much the only usable interface for TPM commands. Unfortunately, the ESAPI interface, as constructed by the TCG, has several cases of premature abstraction the worst of which is a separate abstraction for the TPM handle interface which lives only as long as the lifetime of the connection object and which necessitates multiple conversions to and from internal handle objects if your session or object lives longer than the connection (which can be the case).

There is one final wrinkle is that in the handle abstraction, ESAPI has no API for retrieving the real TPM handle. I’d always wondered why the Intel TSS tpm2 tools always saved the objects they create to a context instead of simply returning the handle to them, but this is the reason: without the ability to transform an internal handle to an external one, you either save the context or let the object die when the connection terminates. This problem is one forced by the ESAPI standard, but eventually it became enough of a problem that the Intel TSS introduced its own additional API to remedy.

The other major difference between the Intel and IBM TSSs is memory handling for returned results: The IBM TSS requires pre-allocated structures whereas the Intel TSS insists on allocation on return. It looks like the Intel TSS should be able to tell if the return pointer is allocated or NULL, but right at the moment it always allocates and overwrites the pointer.

Constructing a unifying Interface for both the IBM and Intel TSSs

In essence the process for converting something that runs with the IBM TSS to being TSS Agnostic is a fairly simple three step process which I’ll illustrate by reference to the openssl tpm2 engine which has already been converted:

  1. Hide the structural differences by inserting a set of macros: VAL() and VAL_2B() which hide most of the TCG induced structure schizophrenia.
  2. Convert the API call structure to be functional instead of via a single TSS_Execute() call. This is quite involved so I did it by adding tpm2_Function() wrappers for each specific invocation.
  3. Introduce the correct premature abstraction for internal and external representation of handles. This was the nastiest step for me because handles are stored in long lived engine structures, and the internal and external representations are both forms of uint32_t even in ESAPI (meaning the compiler won’t complain if you assign one to the other) so it was incredibly painful to get this conversion correct.

Once this is done, the remaining step was to introduce a header which did the impedance matching between the Intel and IBM TSSs and an autoconf macro to detect which TSS is installed and the resulting configure and compile just works. The resulting code will now build and run under either TSS. I should point out that the Intel TSS is missing several helper routines, but these are added into the intel-tss.h header file by copying the from the original IBM TSS. Finally an autoconf check is added to look for the missing internal to external handle transform, and everything is ready to go.

It does seem like it would be easier to port an existing Intel TSS application to the IBM TSS, since points 2 and 3 will already be sorted out. However, all the major TSS library using applications are IBM TSS based, so I haven’t actually been able to verify this.

Remaining Problems and Anomalies

The biggest remaining issue was the test scripts. The openssl TPM2 engine has 27 of them all told, all designed to check the engine function by invoking it via openssl when connected to a software TPM. These scripts are all highly dependent on the IBM TSS command line binaries and the Intel TSS versions seem to be very unstable in terms of argument structure making it pretty much impossible to convert, so I elected finally to have the tests run only if the IBM TSS CLI is installed. The next problem was that the Intel TSS version of the engine didn’t actually pass all the tests. However this was quickly narrowed down to a bug in the Intel TSS when using bound sessions on the NULL seed.

The sole remaining issue is a curious performance anomaly. When running time make check with the IBM TSS, the result is:

real 0m6.100s
user 0m2.827s
sys 0m0.822s

and the same command with the Intel TSS (running one fewer test and skipping the NULL seed) is:

real	0m10.948s
user	0m6.822s
sys	0m0.859s

Showing that the Intel TSS is nearly twice as slow as the IBM one with most of the time differential being user time. Since the tests use a software TPM which can perform the cryptographic operations at the speed of the main CPU, this is showing some type of issue with the command transmission system of the Intel TSS, likely having to do with the fact that most applications use synchronous TPM operations (the engine certainly does) but in the Intel TSS, the synchronous operations are implemented as the corresponding asynchronous pair. Regardless of the root cause, this is unlikely to be a problem with real world TPM crypto where the time taken for any operation will be dominated by the slowness of the physical TPM.

Conclusion

The TSS agnostic scheme adopted by the openssl TPM2 engine should be easily adaptable for all the other non-engine TPM code bases, and thus should pave the way for users not having to choose between applications which only support the Intel or IBM TSSs and can choose to install the best supported one on their distribution. The next steps are to investigate adapting this infrastructure to the existing gnupg patches (done and upstream) and also see if it can be used to solve the gnutls conundrum over supporting TPM based keys.

Deploying Encrypted Images for Confidential Computing

In the previous post I looked at how you build an encrypted image that can maintain its confidentiality inside AMD SEV or Intel TDX. In this post I’ll discuss how you actually bring up a confidential VM from an encrypted image while preserving secrecy. However, first a warning: This post represents the state of the art and includes patches that are certainly not deployed in distributions and may not even be upstream, so if you want to follow along at home you’ll need to patch things like qemu, grub and OVMF. I should also add that, although I’m trying to make everything generic to confidential environments, this post is based on AMD SEV, which is the only confidential encrypted1 environment currently shipping.

The Basics of a Confidential Computing VM

At its base, current confidential computing environments are about using encrypted memory to run the virtual machine and guarding the encryption key so that the owner of the host system (the cloud service provider) can’t get access to it. Both SEV and TDX have the encryption technology inside the main memory controller meaning the L1 cache isn’t encrypted (still vulnerable to cache side channels) and DMA to devices must also be done via unencryped memory. This latter also means that both the BIOS and the Operating System of the guest VM must be enlightened to understand which pages to encrypted and which must not. For this reason, all confidential VM systems use OVMF2 to boot because this contains the necessary enlightening. To a guest, the VM encryption looks identical to full memory encryption on a physical system, so as long as you have a kernel which supports Intel or AMD full memory encryption, it should boot.

Each confidential computing system has a security element which sits between the encrypted VM and the host. In SEV this is an aarch64 processor called the Platform Security Processor (PSP) and in TDX it is an SGX enclave running Intel proprietary code. The job of the PSP is to bootstrap the VM, including encrypting the initial OVMF and inserting the encrypted pages. The security element also includes a validation certificate, which incorporates a Diffie-Hellman (DH) key. Once the guest owner obtains and validates the DH key it can use it to construct a one time ECDH encrypted bundle that can be passed to the security element on bring up. This bundle includes an encryption key which can be used to encrypt secrets for the security element and a validation key which can be used to verify measurements from the security element.

The way QEMU boots a Q35 machine is to set up all the configuration (including a disk device attached to the VM Image) load up the OVMF into rom memory and start the system running. OVMF pulls in the QEMU configuration and constructs the necessary ACPI configuration tables before executing grub and the kernel from the attached storage device. In a confidential VM, the first task is to establish a Guest Owner (the person whose encrypted VM it is) which is usually different from the Host Owner (the person running or controlling the Physical System). Ownership is established by transferring an encrypted bundle to the Secure Element before the VM is constructed.

The next step is for the VMM (QEMU in this case) to ask the secure element to provision the OVMF Firmware. Since the initial OVMF is untrusted, the Guest Owner should ask the Secure Element for an attestation of the memory contents before the VM is started. Since all paths lead through the Host Owner, who is also untrusted, the attestation contains a random nonce to prevent replay and is HMAC’d with a Guest Supplied key from the Launch Bundle. Once the Guest Owner is happy with the VM state, it supplies the Wrapped Key to the secure element (along with the nonce to prevent replay) and the Secure Element unwraps the key and provisions it to the VM where the Guest OS can use it for disc encryption. Finally, the enlightened guest reads the encrypted disk to unencrypted memory using DMA but uses the disk encryptor to decrypt it to encrypted memory, so the contents of the Encrypted VM Image are never visible to the Host Owner.

The Gaps in the System

The most obvious gap is that EFI booting systems don’t go straight from the OVMF firmware to the OS, they have to go via an EFI bootloader (grub, usually) which must be an efi binary on an unencrypted vFAT partition. The second gap is that grub must be modified to pick the disk encryption key out of wherever the Secure Element has stashed it. The third is that the key is currently stashed in VM memory before OVMF starts, so OVMF must know not to use or corrupt the memory. A fourth problem is that the current recommended way of booting OVMF has a flash drive for persistent variable storage which is under the control of the host owner and which isn’t part of the initial measurement.

Plugging The Gaps: OVMF

To deal with the problems in reverse order: the variable issue can be solved simply by not having a persistent variable store, since any mutable configuration information could be used to subvert the boot and leak the secret. This is achieved by stripping all the mutable variable handling out of OVMF. Solving key stashing simply means getting OVMF to set aside a page for a secret area and having QEMU recognise where it is for the secret injection. It turns out AMD were already working on a QEMU configuration table at a known location by the Reset Vector in OVMF, so the secret area is added as one of these entries. Once this is done, QEMU can retrieve the injection location from the OVMF binary so it doesn’t have to be specified in the QEMU Machine Protocol (QMP) command. Finally OVMF can protect the secret and package it up as an EFI configuration table for later collection by the bootloader.

The final OVMF change (which is in the same patch set) is to pull grub inside a Firmware Volume and execute it directly. This certainly isn’t the only possible solution to the problem (adding secure boot or an encrypted filesystem were other possibilities) but it is the simplest solution that gives a verifiable component that can be invariant across arbitrary encrypted boots (so the same OVMF can be used to execute any encrypted VM securely). This latter is important because traditionally OVMF is supplied by the host owner rather than being part of the VM image supplied by the guest owner. The grub script that runs from the combined volume must still be trusted to either decrypt the root or reboot to avoid leaking the key. Although the host owner still supplies the combined OVMF, the measurement assures the guest owner of its correctness, which is why having a fairly invariant component is a good idea … so the guest owner doesn’t have potentially thousands of different measurements for approved firmware.

Plugging the Gaps: QEMU

The modifications to QEMU are fairly simple, it just needs to scan the OVMF file to determine the location for the injected secret and inject it correctly using a QMP command.. Since secret injection is already upstream, this is a simple find and make the location optional patch set.

Plugging the Gaps: Grub

Grub today only allows for the manual input of the cryptodisk password. However, in the cloud we can’t do it this way because there’s no guarantee of a secure tty channel to the VM. The solution, therefore, is to modify grub so that the cryptodisk can use secrets from a provider, in addition to the manual input. We then add a provider that can read the efi configuration tables and extract the secret table if it exists. The current incarnation of the proposed patch set is here and it allows cryptodisk to extract a secret from an efisecret provider. Note this isn’t quite the same as the form expected by the upstream OVMF patch in its grub.cfg because now the provider has to be named on the cryptodisk command line thus

cryptodisk -s efisecret

but in all other aspects, Grub/grub.cfg works. I also discovered several other deviations from the initial grub.cfg (like Fedora uses /boot/grub2 instead of /boot/grub like everyone else) so the current incarnation of grub.cfg is here. I’ll update it as it changes.

Putting it All Together

Once you have applied all the above patches and built your version of OVMF with grub inside, you’re ready to do a confidential computing encrypted boot. However, you still need to verify the measurement and inject the encrypted secret. As I said before, this isn’t easy because, due to replay defeat requirements, the secret bundle must be constructed on the fly for each VM boot. From this point on I’m going to be using only AMD SEV as the example because the Intel hardware doesn’t yet exist and AMD kindly gave IBM research a box to play with (Anyone with a new EPYC 7xx1 or 7xx2 based workstation can likely play along at home, but check here). The first thing you need to do is construct a launch bundle. AMD has a tool called sev-tool to do this for you and the first thing you need to do is obtain the platform Diffie Hellman certificate (pdh.cert). The tool will extract this for you

sevtool --pdh_cert_export

Or it can be given to you by the cloud service provider (in this latter case you’ll want to verify the provenance using sevtool –validate_cert_chain, which contacts the AMD site to verify all the details). Once you have a trusted pdh.cert, you can use this to generate your own guest owner DH cert (godh.cert) which should be used only one time to give a semblance of ECDHE. godh.cert is used with pdh.cert to derive an encryption key for the launch bundle. You can generate this with

sevtool --generate_launch_blob <policy>

The gory details of policy are in the SEV manual chapter 3, but most guests use 1 which means no debugging. This command will generate the godh.cert, the launch_blob.bin and a tmp_tk.bin file which you must save and keep secure because it contains the Transport Encryption and Integrity Keys (TEK and TIK) which will be used to encrypt the secret. Figuring out the qemu command line options needed to launch and pause a SEV guest is a bit of a palaver, so here is mine. You’ll likely need to change things, like the QMP port and the location of your OVMF build and the launch secret.

Finally you need to get the launch measure from QMP, verify it against the sha256sum of OVMF.fd and create the secret bundle with the correct GUID headers. Since this is really fiddly to do with sevtool, I wrote this python script3 to do it all (note it requires qmp.py from the qemu git repository). You execute it as

sevsecret.py --passwd <disk passwd> --tiktek-file <location of tmp_tk.bin> --ovmf-hash <hash> --socket <qmp socket>

And it will verify the launch measure and encrypt the secret for the VM if the measure is correct and start the VM. If you got everything correct the VM will simply boot up without asking for a password (if you inject the wrong secret, it will still ask). And there you have it: you’ve booted up a confidential VM from an encrypted image file. If you’re like me, you’ll also want to fire up gdb on the qemu process just to show that the entire memory of the VM is encrypted …

Conclusions and Caveats

The above script should allow you to boot an encrypted VM anywhere: locally or in the cloud, provided you can access the QMP port (most clouds use libvirt which introduces yet another additional layering pain). The biggest drawback, if you refer to the diagram, is the yellow box: you must trust the secret element, which in both Intel and AMD is proprietary4, in order to get confidential computing to work. Although there is hope that in future the secret element could be fully open source, it isn’t today.

The next annoyance is that launching a confidential VM is high touch requiring collaboration from both the guest owner and the host owner (due to the anti-replay nonce). For a single launch, this is a minor annoyance but for an autoscaling (launch VMs as needed) platform it becomes a major headache. The solution seems to be to have some Hardware Security Module (HSM), like the cloud uses today to store encryption keys securely, and have it understand how to measure and launch encrypted VMs on behalf of the guest owner.

The final conclusion to remember is that confidentiality is not security: your VM is as exploitable inside a confidential encrypted VM as it was outside. In many ways confidentiality and security are opposites, in that security in part requires reducing the trusted code and confidentiality requires pulling as much as possible inside. Confidential VMs do have an answer to the Cloud trust problem since the enterprise can now deploy VMs without fear of tampering by the cloud provider, but those VMs are as insecure in the cloud as they were in the Enterprise Data Centre. All of this argues that Confidential Computing, while an important milestone, is only one step on the journey to cloud security.

Patch Status

The OVMF patches are upstream (including modifications requested by Intel for TDX). The QEMU and grub patch sets are still on the lists.

Building Encrypted Images for Confidential Computing

With both Intel and AMD announcing confidential computing features to run encrypted virtual machines, IBM research has been looking into a new format for encrypted VM images. The first question is why a new format, after all qcow2 only recently deprecated its old encrypted image format in favour of luks. The problem is that in confidential computing, the guest VM runs inside the secure envelope but the host hypervisor (including the QEMU process) is untrusted and thus runs outside the secure envelope and, unfortunately, even for the new luks format, the encryption of the image is handled by QEMU and so the encryption key would be outside the secure envelope. Thus, a new format is needed to keep the encryption key (and, indeed, the encryption mechanism) within the guest VM itself. Fortunately, encrypted boot of Linux systems has been around for a while, and this can be used as a practical template for constructing a fully confidential encrypted image format and maintaining that confidentiality within a hostile cloud environment. In this article, I’ll explore the state of the art in encrypted boot, constructing EFI encrypted boot images, and finally, in the follow on article, look at deploying an encrypted image into a confidential environment and maintaining key secrecy in the cloud.

Encrypted Boot State of the Art

Luks and the cryptsetup toolkit have been around for a while and recently (in 2018), the luks format was updated to version 2. However, actually booting a linux kernel from an encrypted partition has always been a bit of a systems problem, primarily because the bootloader (grub) must decrypt the partition to actually load the kernel. Fortunately, grub can do this, but unfortunately the current grub in most distributions (2.04) can only read the version 1 luks format. Secondly, the user must type the decryption passphrase into grub (so it can pull the kernel and initial ramdisk out of the encrypted partition to boot them), but grub currently has no mechanism to pass it on to the initial ramdisk for mounting root, meaning that either the user has to type their passphrase twice (annoying) or the initial ramdisk itself has to contain a file with the disk passphrase. This latter is the most commonly used approach and only has minor security implications when the system is in motion (the ramdisk and the key file must be root read only) and the password is protected at rest by the fact that the initial ramdisk is also on the encrypted volume. Even more annoying is the fact that there is no distribution standard way of creating the initial ramdisk. Debian (and Ubuntu) have the most comprehensive documentation on how to do this, so the next section will look at the much less well documented systemd/dracut mechanism.

Encrypted Boot for Systemd/Dracut

Part of the problem here seems to be less that stellar systems co-ordination between the two components. Additionally, the way systemd supports passphraseless encrypted volumes has been evolving for a while but changed again in v246 to mirror the Debian method. Since cloud images are usually pretty up to date, I’ll describe this new way. Each encrypted volume is referred to by UUID (which will be the UUID of the containing partition returned by blkid). To get dracut to boot from an encrypted partition, you must pass in

rd.luks.uuid=<UUID>

but you must also have a key file named

/etc/cryptsetup-keys.d/luks-<UUID>.key

And, since dracut hasn’t yet caught up with this, you usually need a cryptodisk.conf file in /etc/dracut.conf.d/ which contains

install_items+=" /etc/cryptsetup-keys.d/* "

Grub and EFI Booting Encrypted Images

Traditionally grub is actually installed into the disk master boot record, but for EFI boot that changed and the disk (or VM image) must have an EFI System partition which is where the grub.efi binary is installed. Part of the job of the grub.efi binary is to find the root partition and source the /boot/grub1/grub.cfg. When you install grub on an EFI partition a search for the root by UUID is actually embedded into the grub binary. Another problem is likely that your distribution customizes the location of grub and updates the boot variables to tell the system where it is. However, a cloud image can’t rely on the boot variables and must be installed in the default location (\EFI\BOOT\bootx64.efi). This default location can be achieved by adding the –removable flag to grub-install.

For encrypted boot, this becomes harder because the grub in the EFI partition must set up the cryptographic location by UUID. However, if you add

GRUB_ENABLE_CRYPTODISK=y

To /etc/default/grub it will do the necessary in grub-install and grub-mkconfig. Note that on Fedora, where every other GRUB_ENABLE parameter is true/false, this must be ‘y’, unfortunately grub-install will look for =y not =true.

Putting it all together: Encrypted VM Images

Start by extracting the root of an existing VM image to a tar file. Make sure it has all the tools you will need, like cryptodisk and grub-efi. Create a two partition raw image file and loopback mount it (I usually like 4GB) with a small efi partition (p1) and an encrypted root (p2):

truncate -s 4GB disk.img
parted disk.img mklabel gpt
parted disk.img mkpart primary 1Mib 100Mib
parted disk.img mkpart primary 100Mib 100%
parted disk.img set 1 esp on
parted disk.img set 1 boot on

Now setup the efi and cryptosystem (I use ext4, but it’s not required). Note at this time luks will require a password. Use a simple one and change it later. Also note that most encrypted boot documents advise filling the encrypted partition with random numbers. I don’t do this because the additional security afforded is small compared with the advantage of converting the raw image to a smaller qcow2 one.

losetup -P -f disk.img          # assuming here it uses loop0
l=($(losetup -l|grep disk.img)) # verify with losetup -l
mkfs.vfat ${l}p1
blkid ${l}p1       # remember the EFI partition UUID
cryptsetup --type luks1 luksFormat ${l}p2 # choose temp password
blkid ${l}p2       # remember this as <UUID> you'll need it later 
cryptsetup luksOpen ${l}p2 cr_root
mkfs.ext4 /dev/mapper/cr_root
mount /dev/mapper/cr_root /mnt
tar -C /mnt -xpf <vm root tar file>
for m in run sys proc dev; do mount --bind /$m /mnt/$m; done
chroot /mnt

Create or modify /etc/fstab to have root as /dev/disk/cr_root and the EFI partition by label under /boot/efi. Now set up grub for encrypted boot2

echo "GRUB_ENABLE_CRYPTODISK=y" >> /etc/default/grub
mount /boot/efi
grub-install --removable --target=x86_64-efi
grub-mkconfig -o /boot/grub/grub.cfg

For Debian, you’ll need to add an /etc/crypttab entry for the encrypted disk:

cr_root UUID=<uuid> luks none

And then re-create the initial ramdisk. For dracut systems, you’ll have to modify /etc/default/grub so the GRUB_CMDLINE_LINUX has a rd.luks.uuid=<UUID> entry. If this is a selinux based distribution, you may also have to trigger a relabel.

Now would also be a good time to make sure you have a root password you know or to install /root/.ssh/authorized_keys. You should unmount all the binds and /mnt and try EFI booting the image. You’ll still have to type the password a couple of times, but once the image boots you’re operating inside the encrypted envelope. All that remains is to create a fast boot high entropy low iteration password and replace the existing one with it and set the initial ramdisk to use it. This example assumes your image is mounted as SCSI disk sda, but it may be a virtual disk or some other device.

dd if=/dev/urandom bs=1 count=33|base64 -w 0 > /etc/cryptsetup-keys.d/luks-<UUID>.key
chmod 600 /etc/cryptsetup-keys.d/luks-<UUID>.key
cryptsetup --key-slot 1 luksAddKey /dev/sda2 # permanent recovery key
cryptsetup --key-slot 0 luksRemoveKey /dev/sda2 # remove temporary
cryptsetup --key-slot 0 --iter-time 1 luksAddKey /dev/sda2 /etc/cryptsetup-keys.d/luks-<UUID>.key

Note the “-w 0” is necessary to prevent the password from having a trailing newline which will make it difficult to use. For mkinitramfs systems, you’ll now need to modify the /etc/crypttab entry

cr_root UUID=<UUID> /etc/cryptsetup-keys.d/luks-<UUID>.key luks

For dracut you need the key install hook in /etc/dracut.conf.d as described above and for Debian you need the keyfile pattern:

echo "KEYFILE_PATTERN=\"/etc/cryptsetup-keys.d/*\"" >>/etc/cryptsetup-initramfs/conf-hook

You now rebuild the initial ramdisk and you should now be able to boot the cryptosystem using either the high entropy password or your rescue one and it should only prompt in grub and shouldn’t prompt again. This image file is now ready to be used for confidential computing.

Creating a Home IPv6 Network

One of the recent experiences of Linux Plumbers Conference convinced me that if you want to be part of a true open source WebRTC based peer to peer audio/video interaction, you need an internet address that’s not behind a NAT. In reality, the protocol still works as long as you can contact a stun server to tell you what your external address is and possibly a turn server to proxy the packets if both endpoints are NATed but all this seeking external servers takes time as those of you who complained about the echo test found. The solution to all this is to connect over IPv6 which has an address space large enough to support every device on the planet having its own address. All modern Linux distributions support IPv6 out of the box so the chances are you’ve actually accidentally used it without ever noticing, which is one of the beauties of IPv6 autoconfiguration (it’s supposed to just work).

However, I recently moved, and so lost my fibre internet connection to cable but cable that did come with an IPv6 address, so this is my story of getting it all to work. If you don’t really care about the protocol basics, you can skip down to the how. This guide is also focussed on a “dual stack” configuration (one that has both IPv6 and IPv4 addresses). Pure IPv6 configurations are possible, but because some parts of the internet are still IPv4 only, they’re not complete unless you set up an IPv4 encapsulating bridge.

The Basics of IPv6

IPv6 has been a mature protocol for a long time now, so I erroneously assumed there’d be a load of good HOWTOs about it. However, after reading 20 different descriptions of how the IPv6 128 bit address space works and not much else, I gave up in despair and read the RFCs instead. I’ll assume you’ve read at least one of these HOWTOS, so I don’t have to go into IPv6 address prefixes, suffixes, interface IDs or subnets so I’ll begin where most of the HOWTOs end.

How does IPv6 Just Work?

In IPv4 there’s a protocol called dynamic host configuration protocol (DHCP) so as long as you can find a DHCP server you can get all the information you need to connect (local address, router, DNS server, time server, etc). However, this service has to be set up by someone and IPv6 is designed to configure a network without it.

The first assumption IPv6 StateLess Address AutoConfiguration (SLAAC) makes is that it’s on a /64 subnet (So every subnet in IPv6 contains 1010 times as many addresses as the entire IPv4 internet). This means that, since most real subnets contain <100 systems, they can simply choose a random address and be very unlikely to clash with the existing systems. In fact, there are three current ways of choosing an address in the /64:

  1. EUI-64 (RFC 4291) based on the MAC address which is basically the MAC with one bit flipped and ff:fe placed in the middle.
  2. Stable Private (RFC 7217) which generate from a hash based on a static key, interface, prefix and a counter (the counter is incremented if there is a clash). These are preferred to the EUI-64 ones which give away any configuration associated with the MAC address (such as what type of network card you have)
  3. Privacy Extension Addresses (RFC 4941) which are very similar to stable private addresses except they change over time using the IPv6 address deprecation mechanism and are for client systems who want to preserve anonymity.

The next problem in Linux is who configures the interface? The Kernel IPv6 stack is actually designed to do it, and will unless told not to, but most of the modern network controllers (like NetworkManager) are control freaks and turn off the kernel’s auto configuration so they can do it themselves. They also default to stable private addressing using a static secret maintained in the filesystem (/var/lib/NetworkManager/secret_key).

The next thing to understand about IPv6 addresses is that they are divided into scopes, the most important being link local (unrouteable) addresses which conventionally always have the prefix fe80::/64. The link local address is configured first using one of the above methods and then used to probe the network.

Multicast and Neighbour Discovery

Unlike IPv4, IPv6 has no broadcast capability so all discovery is done by multicast. Nodes coming up on the network subscribe to particular multicast addresses, via special packets intercepted by the switch, and won’t receive any multicast to which they’re not subscribed. Conventionally, all link local multicast addresses have the prefix ff02::/64 (for other types of multicast address see RFC 4291). All nodes subscribe to the “all nodes” multicast address ff02::1 and also must subscribe to their own solicited node multicast address at ff02::1:ffXX:XXXX where the last 24 bits correspond to the lowest 24 bits of the node’s IPv6 address. This latter is to avoid the disruption that used to occur in IPv4 from ARP broadcasts because now you can target a specific subset of nodes for address resolution.

The IPV6 address resolution protocol is called Neighbour Solicitation (NS), described in RFC 4861 and it’s use with SLAAC described in RFC 4862, and is done by sending a multicast to the neighbor solicitation address of the node you want to discover containing the full IPv6 address you want to know, a node with the matching address replies with its link layer (MAC) address in a Neighbour Advertisement (NA) packet.

Once a node has chosen its link local address, it first sends out a NS packet to its chosen address to see if anyone replies and if no-one does it assumes it is OK to keep it otherwise it follows the collision avoidance protocol associated with its particular form of address. Once it has found a unique address, the node configures this link local address and looks for a router. Note that if an IPv6 network isn’t present, discovery stops here, which is why most network interfaces always show a link local IPv6 address.

Router Discovery

Once the node has its own unique link local address, it uses it to send out Router Solicitation (RS) packets to the “all routers” multicast address ff02::2. Every router on the network responds with a Router Advertisement (RA) packet which describes (among other things) the the router lifetime, the network MTU, a set of one or more prefixes the router is responsible for, the router’s link address and a set of option flags including the M (Managed) and O (Other Configuration) flag and possibly a set of DNS servers.

Each advertised prefix contains the prefix and prefix length, a set of flags including the A (autonomous configuration) and L (link local) and a set of lifetimes. The Link Local prefixes tell you what global prefixes the local network users (there may be more than one) and whether you are allowed to do SLAAC on the global prefix (if the A flag is clear, you must ask the router for an address using DHCPv6). If the router has a non zero lifetime, you may assume it is a default router for the subnet.

Now that the node has discovered one or more routers it may configure its own global address (note that every IPv6 routeable node has at least two addresses: a link local and a global). How it does this depends on the router and prefix flags

Global Address Configuration

The first thing a node needs to know is whether to use SLAAC for the global address or DHCPv6. This is entirely determined by the A flag of any link local prefix in the RA packet. If A is set, then the node may use SLAAC and if A is clear then the node must use DHCPv6 to obtain an address. If A is set and also the M (Managed) flag then the node may use either SLAAC or DHCPv6 (or both) to obtain an address and if the M flag is clear, but the O (Other Config) flag is present then the node must use SLAAC but may use DHCPv6 to obtain other information about the network (usually DNS).

Once the node has a global address in now needs a default route. It forms the default route list from the RA packets that have a non-zero router Lifetime. All of these are configured as default routes to their link local address with the RA specified hop count. Finally, the node may add specific prefix routes from RA packets with zero router LifeTimes but non link local prefixes.

DHCPv6 is a fairly complex configuration protocol (see RFC 8415) but it cannot specify either prefix length (meaning all obtained addresses are configured as /128) or routes (these must be obtained from RA packets). This leads to a subtlety of outbound address selection in that the most specific is always preferred, so if you configure both by SLAAC and DHCPv6, the SLAAC address will be added as /64 and the DHCPv6 address as /128 meaning your outbound IP address will always be the DHCPv6 one (although if an external entity knows your SLAAC address, they will still be able to reach you on it).

The How: Configuring your own Home Router

One of the things you’d think from the above is that IPv6 always auto configures and, while it is true that if you simply plug your laptop into the ethernet port of a cable modem it will just automatically configure, most people have a more complex home setup involving a router, which needs some special coaxing before it will work. That means you need to obtain additional features from your ISP using special DHCPv6 requests.

This section is written from my own point of view: I have a rather complex IPv4 network which has a completely open but bandwidth limited (to untrusted clients) wifi network, and several protected internal networks (one for my lab, one for my phones and one for the household video cameras), so I need at least 4 subnets to give every device in my home an IPv6 address. I also use OpenWRT as my router distribution, so all the IPv6 configuration information is highly specific to it (although it should be noted that things like NetworkManager can also do all of this if you’re prepared to dig in the documentation).

Prefix Delegation

Since DHCPv6 only hands out a /128 address, this isn’t sufficient because it’s the IP address of the router itself. In order to become a router, you must request delegation of part of the IPv6 address space via the Identity Association for Prefix Delection (IA_PD) option of DHCPv6. Once this is done the router IP address will be assumed by the ISP to be the route for all of the delegated prefixes. The subtlety here is that if you want more than one subnet, you have to ask for it specifically (the client must specify the exact prefix length it’s looking for) and since it’s a prefix length, and your default subnet should be /64, if you request a prefix length of 64 you only have one subnet. If you request 63 you have 2 and so on. The problem is how do you know how many subnets the ISP is willing to give you? Unfortunately there’s no way of finding this (I had to do an internet search to discover my ISP, Comcast, was willing to delegate a prefix length of 60, meaning 16 subnets). If searching doesn’t tell you how much your ISP is willing to delegate, you could try starting at 48 and working your way to 64 in increments of 1 to see what the largest delegation you can get away with (There have been reports of ISPs locking you at your first delegated prefix length, so don’t start at 64). The final subtlety is that the prefix you’re delegated may not be the same prefix as the address your router obtained (my current comcast configuration has my router at 2001:558:600a:… but my delegated prefix is 2601:600:8280:66d0:/60). Note you can run odhcp6c manually with the -P option if you have to probe your ISP to find out what size of prefix you can get.

Configuring the Router for Prefix Delegation

In OpenWRT terms, the router WAN DHCP(v6) configuration is controlled by /etc/default/network. You’ll already have a WAN interface (likely called ‘wan’) for DHCPv4, so you simply add an additional ‘wan6’ interface to get an additional IPv6 and become dual stack. In my configuration this looks like

config interface 'wan6'
        option ifname '@wan'
        option proto 'dhcpv6'
        option reqprefix 60

The slight oddity is the ifname: @wan simply tells the config to use the same ifname as the ‘wan’ interface. Naming it this way is essential if your wan is a bridge, but it’s good practice anyway. The other option ‘reqprefix’ tells DHCPv6 to request a /60 prefix delegation.

Handing Out Delegated Prefixes

This turns out to be remarkably simple. Firstly you have to assign a delegated prefix to each of your other interfaces on the router, but you can do this without adding a new OpenWRT interface for each of them. My internal IPv4 network has all static addresses, so you add three directives to each of the interfaces:

config interface 'lan'
        ... interface designation (bridge for me)
        option proto 'static'
        ... ipv4 addresses
        option ip6assign '64'
        option ip6hint '1'
        option ip6ifaceid '::ff'

ip6assign ‘N’ means you are a /N network (so this is always /64 for me) and ip6hint ‘N’ means use N as your subnet id and ip6ifaceid ‘S’ means use S as the IPv6 suffix (This defaults to ::1 so if you’re OK with that, omit this directive). So given I have a 2601:600:8280:66d0::/60 prefix, the global address of this interface will be 2601:600:8280:66d1::ff. Now the acid test, if you got this right, this global address should be pingable from anywhere on the IPv6 internet (if it isn’t, it’s likely a firewall issue, see below).

Advertising as a Router

Simply getting delegated a delegated prefix on a local router interface is insufficient . Now you need to get your router to respond to Router Solicitations on ff02::2 and optionally do DHCPv6. Unfortunately, OpenWRT has two mechanisms for doing this, usually both installed: odhcpd and dnsmasq. What I found was that none of my directives in /etc/config/dhcp would take effect until I disabled odhcpd completely

/etc/init.d/odhcpd stop
/etc/init.d/odhcpd disable

and since I use dnsmasq extensively elsewhere (split DNS for internal/external networks), that suited me fine. I’ll describe firstly what options you need in dnsmasq and secondly how you can achieve this using entries in the OpenWRT /etc/config/dhcp file (I find this useful because it’s always wise to check what OpenWRT has put in the /var/etc/dnsmasq.conf file).

The first dnsmasq option you need is ‘enable-ra’ which is a global parameter instructing dnsmasq to handle router advertisements. The next parameter you need is the per-interface ‘ra-param’ which specifies the global router advertisement parameters and must appear once for every interface you want to advertise on. Finally the ‘dhcp-range’ option allows more detailed configuration of the type of RA flags and optional DHCPv6.

SLAAC or DHCPv6 (or both)

In many ways this is a matter of personal choice. If you allow SLAAC, hosts which want to use privacy extension addresses (like Android phones) can do so, which is a good thing. If you also allow DHCPv6 address selection you will have a list of addresses assigned to hosts and dnsmasq will do DNS resolution for them (although it can do DNS for SLAAC addresses provided it gets told about them). A special tag ‘constructor’ exists for the ‘dhcp-range’ option which tells it to construct the supplied address (for either RA or DHCPv6) from the IPv6 global prefix of the specified interface, which is how you pass out our delegated prefix addresses. The modes for ‘dhcp-range’ are ‘ra-only’ to disallow DHCPv6 entirely, ‘slaac’ to allow DHCPv6 address selection and ‘ra-stateless’ to disallow DHCPv6 address selection but allow other DHCPv6 configuration information.

Based on trial and error (and finally examining the scripting in /etc/init.d/dnsmasq) the OpenWRT options required to achieve the above dnsmasq options are

config dhcp lan
        option interface lan
        option start 100
        option limit 150
        option leasetime 1h
        option dhcpv6 'server'
        option ra_management '1'
        option ra 'server'

with ‘ra_management’ as the key option with ‘0’ meaning SLAAC with DHCPv6 options, ‘1’ meaning SLAAC with full DHCPv6, ‘2’ meaning DHCPv6 only and ‘3’ meaning SLAAC only. Another OpenWRT oddity is that there doesn’t seem to be a way of setting the lease range: it always defaults to either static only or ::1000 to ::ffff.

Firewall Configuration

One of the things that trips people up is the fact that Linux has two completely separate firewalls: one for IPv4 and one for IPv6. If you’ve ever written any custom rules for them, the chances are you did it in the OpenWRT /etc/firewall.user file and you used the iptables command, which means you only added the rules to the IPv4 firewall. To add the same rule for IPv6 you need to duplicate it using the ip6tables command. Another significant problem, if you’re using a connection tracking for port knock detection like I am, is that Linux connection tracking has difficulty with IPv6 multicast, so packets that go out to a multicast but come back as unicast (as most of the discovery protocols do) get the wrong conntrack state. To fix this, I eventually had to have an INPUT rule just accepting all ICMPv6 and DHCPv6 (udp ports 546 [client] and 547 [server]). The other firewall considerations are that now everyone has their own IP address, there’s no need to NAT (OpenWRT can be persuaded to take care of this automatically, but if you’re duplicating the IPv4 rules manually, don’t duplicate the NAT rules). The final one is likely more applicable to me: my wifi interface is designed to be an extension of the local internet and all machines connecting to it are expected to be able to protect themselves since they’ll migrate to such hostile environments as airport wifi, thus I do complete exposure of wifi connected devices to the general internet for all ports, including port probes. For my internal devices, I have a RELATED,ESTABLISHED rule to make sure they’re not probed since they’re not designed to migrate off the internal network.

Now the problems with OpenWRT: since you want NAT on IPv4 but not on IPv6 you have to have two separate wan zones for them: if you try to combine them (as I first did), then OpenWRT will add an IPv6 –ctstate INVALID rule which will prevent Neighbour Discovery from working because of the conntrack issues with IPv6 multicast, so my wan zones are (well, this is a lie because my firewall is now hand crafted, but this is what I checked worked before I put the hand crafted firewall in place):

config zone
        option name 'wan'
        option network 'wan'
        option masq '1'
        ...

config zone
        option name 'wan6'
        option network 'wan6'
        ...

And the routing rules for the lan zone (fully accessible) are

config forwarding
        option src 'lan'
        option dest 'wan'

config forwarding
        option src 'lan'
        option dest 'wan6'

config forwarding
        option src 'wan6'
        option dest 'lan'

Putting it Together: Getting the Clients IPv6 Connected

Now that you have your router configured, everything should just work. If it did, your laptop wifi interface should now have a global IPv6 address

ip -6 address show dev wlan0

If that comes back empty, you need to enable IPv6 on your distribution. If it has only a link local (fe80:: prefix) address, IPv6 is enabled but your router isn’t advertising (suspect firewall issues with discovery packets or dnsmasq misconfiguration). If you see a global address, you’re done. Now you should be able to go to https://testv6.com and secure a 10/10 score.

The final piece of the puzzle is preferring your new IPv6 connection when DNS offers a choice of IPv4 or IPv6 addresses. All modern Linux clients should prefer IPv6 when available if connected to a dual stack network, so try … if you ping, say, www.google.com and see an IPv6 address you’re done. If not, you need to get into the murky world of IPv6 address labelling (RFC 6724) and gai.conf.

Conclusion

Adding IPv6 to and existing IPv4 setup is currently not a simple plug in and go operation. However, provided you understand a handful of differences between the two protocols, it’s not an insurmountable problem either. I have also glossed over many of the problems you might encounter with your ISP. Some people have reported that their ISPs only hand out one IPv6 address with no prefix delegation, in which case I think finding a new ISP would be wisest. Others report that the ISP will only delegate one /64 prefix so your choice here is either to only run one subnet (likely sufficient for a lot of home networks), or subnet at greater than /64 and forbid SLAAC, which is definitely not a recommended configuration. However, provided your ISP is reasonable, this blog post should at least help get you started.

Lessons from the GNOME Patent Troll Incident

First, for all the lawyers who are eager to see the Settlement Agreement, here it is. The reason I can do this is that I’ve released software under an OSI approved licence, so I’m covered by the Releases and thus entitled to a copy of the agreement under section 10, but I’m not a party to any of the Covenants so I’m not forbidden from disclosing it.

Analysis of the attack

The Rothschild Modus Operandi is to obtain a fairly bogus patent (in this case, patent 9,936,086), form a limited liability corporation (LLC) that only holds the one patent and then sue a load of companies with vaguely related businesses for infringement. A key element of the attack is to offer a settlement licensing the patent for a sum less than it would cost even to mount an initial defence (usually around US$50k), which is how the Troll makes money: since the cost to file is fairly low, as long as there’s no court appearance, the amount gained is close to US$50k if the target accepts the settlement offer and, since most targets know how much any defence of the patent would cost, they do.

One of the problems for the target is that once the patent is issued by the USPTO, the court must presume it is valid, so any defence that impugns the validity of the patent can’t be decided at summary judgment. In the GNOME case, the sued project, shotwell, predated the filing of the patent by several years, so it should be obvious that even if shotwell did infringe the patent, it would have been prior art which should have prevented the issuing of the patent in the first place. Unfortunately such an obvious problem can’t be used to get the case tossed on summary judgement because it impugns the validity of the patent. Put simply, once the USPTO issues a patent it’s pretty much impossible to defend against accusations of infringement without an expensive trial which makes the settlement for small sums look very tempting.

If the target puts up any sort of fight, Rothschild, knowing the lack of merits to the case, will usually reduce the amount offered for settlement or, in extreme cases, simply drop the lawsuit. The last line of defence is the LLC. If the target finds some way to win damages (as ADS did in 2017) , the only thing on the hook is the LLC with the limited liability shielding Rothschild personally.

How it Played out Against GNOME

This description is somewhat brief, for a more in-depth description see the Medium article by Amanda Brock and Matt Berkowitz.

Rothschild performed the initial attack under the LLC RPI (Rothschild Patent Imaging). GNOME was fortunate enough to receive an offer of Pro Bono representation from Shearman and Sterling and immediately launched a defence fund (expecting that the cost of at least getting into court would be around US$200k, even with pro bono representation). One of its first actions, besides defending the claim was to launch a counterclaim against RPI alleging exceptional practices in bringing the claim. This serves two purposes: firstly, RPI can’t now simply decide to drop the lawsuit, because the counterclaim survives and secondly, by alleging potential misconduct it seeks to pierce the LLC liability shield. GNOME also decided to try to obtain as much as it could for the whole of open source in the settlement.

As it became clear to Rothschild that GNOME wouldn’t just pay up and they would create a potential liability problem in court, the offers of settlement came thick and fast culminating in an offer of a free licence and each side would pay their own costs. However GNOME persisted with the counter claim and insisted they could settle for nothing less than the elimination of the Rothschild patent threat from all of open source. The ultimate agreement reached, as you can read, does just that: gives a perpetual covenant not to sue any project under an OSI approved open source licence for any patent naming Leigh Rothschild as the inventor (i.e. the settlement terms go far beyond the initial patent claim and effectively free all of open source from any future litigation by Rothschild).

Analysis of the Agreement

Although the agreement achieves its aim, to rid all of Open Source of the Rothschild menace, it also contains several clauses which are suboptimal, but which had to be included to get a speedy resolution. In particular, Clause 10 forbids the GNOME foundation or its affiliates from publishing the agreement, which has caused much angst in open source circles about how watertight the agreement actually was. Secondly Clause 11 prohibits GNOME or its affiliates from pursuing any further invalidity challenges to any Rothschild patents leaving Rothschild free to pursue any non open source targets.

Fortunately the effect of clause 10 is now mitigated by me publishing the agreement and the effect of clause 11 by the fact that the Open Invention Network is now pursuing IPR invalidity actions against the Rothschild patents.

Lessons for the Future

The big lesson is that Troll based attacks are a growing threat to the Open Source movement. Even though the Rothschild source may have been neutralized, others may be tempted to follow his MO, so all open source projects have to be prepared for a troll attack.

The first lesson should necessarily be that if you’re in receipt of a Troll attack, tell everyone. As an open source organization you’re not going to be able to settle and you won’t get either pro bono representation or the funds to fight the action unless people know about it.

The second lesson is that the community will rally, especially with financial aid, if you put out a call for help (and remember, you may be looking at legal bills in the six figure range).

The third lesson is always file a counter claim to give you significant leverage over the Troll in settlement negotiations.

And the fourth lesson is always refuse to settle for nothing less than neutralization of the threat to the entirety of open source.

Conclusion

While the lessons above should work if another Rothschild like Troll comes along, it’s by no means guaranteed and the fact that Open Source project don’t have the funding to defend themselves (even if they could raise it from the community) makes them look vulnerable. One thing the entire community could do to mitigate this problem is set up a community defence fund. We did this once before 16 years ago when SCO was threatening to sue Linux users and we could do it again. Knowing there was a deep pot to draw on would certainly make any Rothschild like Troll think twice about the vulnerability of an Open Source project, and may even deter the usual NPE type troll with more resources and better crafted patents.

Finally, it should be noted that this episode demonstrates how broken the patent system still is. The key element Rothschild like trolls require is the presumption of validity of a granted patent. In theory, in the light of the Alice decision, the USPTO should never have granted the patent but it did and once that happened the troll targets have no option than either to pay up the smaller sum requested or expend a larger sum on fighting in court. Perhaps if the USPTO can’t stop the issuing of bogus patents it’s time to remove the presumption of their validity in court … or at least provide some sort of prima facia invalidity test to apply at summary judgment (like the project is older than the patent, perhaps).

A Roadmap for Eliminating Patents in Open Source

The realm of Software Patents is often considered to be a fairly new field which isn’t really influenced by anything else that goes on in the legal lansdcape. In particular there’s a very old field of patent law called exhaustion which had, up until a few years ago, never been applied to software patents. This lack of application means that exhaustion is rarely raised as a defence against infringement and thus it is regarded as an untested strategy. Van Lindberg recently did a FOSDEM presentation containing interesting ideas about how exhaustion might apply to software patents in the light of recent court decisions. The intriguing possibility this offers us is that we may be close to an enforceable court decision (at least in the US) that would render all patents in open source owned by community members exhausted and thus unenforceable. The purpose of this blog post is to explain the current landscape and how we might be able to get the necessary missing court decisions to make this hope a reality.

What is Patent Exhaustion?

Patent law is ancient, going back to Greece in around 500BC. However, every legal system has been concerned that patent holders, being an effective monopoly with the legal right to exclude others, did not abuse that monopoly position. This lead to the concept that if you used your monopoly power to profit, you should only be able to do it once for the same item so that absolute property rights couldn’t be clouded by patents. This leads to something called the exhaustion doctrine: so if Alice holds a patent on some item which she sells to Bob and Bob later sells the same item to Charlie, Alice can’t force Bob or Charlie to give her a part of their sale proceeds in exchange for her allowing Charlie to practise the patent on the item. The patent rights are said to be exhausted with the sale from Alice to Bob, so there are no patent rights left to enforce on Charlie. The exhaustion doctrine has since been expanded to any authorized transfer, even if no money changes hands (so if Alice simply gave Bob the item instead of selling it, the patent still exhausts at that transaction and Bob is still free to give or sell the item to Charlie without interference from Alice).

Of course, modern US patent rights have been around now for two centuries and in that time manufacturers have tried many ingenious schemes to get around the exhaustion doctrine profitably, all of which have so far failed in the courts, leading to quite a wealth of case law on the subject. The most interesting recent example (Lexmark v Impression) was over whether a patent holder could use their patent power to enforce any onward conditions at all for which the US Supreme Court came to the conclusive finding: they can’t and goes on to say that all patent rights in the item terminate in the first authorized transfer. That doesn’t mean no post sale conditions can be imposed, they can by contract or licence or other means, it just means post sale conditions can’t be enforced by patent actions. This is the bind for Lexmark: their sales contracts did specify that empty cartridges couldn’t be resold, so their customers violated that contract by selling the cartridges to Impression to refill and resell. However, that contract was between Lexmark and the customer not Lexmark and Impression, so absent patent remedies Lexmark has no contractual case against Impression, only against its own customers.

Can Exhaustion apply if Software isn’t actually sold?

The exhaustion doctrine actually has an almost identical equivalent for copyright called the First Sale doctrine. Back when software was being commercialized, no software distributor liked the idea that copyright in software exhausts after it is sold, so the idea of licensing instead of selling software was born, which is why you always get that end user licence agreement for software you think you bought. However, this makes all software (including open source) a very tricky for patent exhaustion because there’s no first sale to exhaust the rights.

The idea that Exhaustion didn’t have to involve an exchange of something (so became authorized transfer instead of first sale) in US law is comparatively recent, dating to a 2013 decision LifeScan v Shasta where one point won on appeal was that giving away devices did exhaust the patent. The idea that authorized transfer could extend to software downloads really dates to Cascades v Samsung in 2014.

The bottom line is that exhaustion does apply to software and downloading is an authorized transfer within the meaning of the Exhaustion Doctrine.

The Implications of Lexmark v Impression for Open Source

The precedent for Open Source is quite clear: Patents cannot be used to impose onward conditions that the copyright licence doesn’t. For instance the Open Air Interface 5G alliance public licence attempts just such a restriction in clause 3 “Grant of Patent License” where it tries to restrict the grant to being only if you use the source for “study and research” otherwise you need an additional patent licence from OAI. Lexmark v Impressions makes that clause invalid in the licence: once you obtain open source under the OAI licence, the OAI patents exhaust at that point and there are no onward patent rights left to enforce. This means that source distributed under OAI can be reused under the terms of the copyright licence (which is permissive) without any fear of patent restrictions. Now OAI can still amend its copyright licence to impose the field of use restrictions and enforce them via copyright means, it just can’t use patents to do so.

FRAND and Open Source

There have recently been several attempts to claim that FRAND patent enforcement and Open Source licensing can be compatible, or more specifically a FRAND patent pool holder like a Standards Development Organization can both produce an Open Source reference implementation and still collect patent Royalties. This looks to be wrong, however; the Supreme Court decision is clear: once a FRAND Patent pool holder distributes any code, that distribution is an authorized transfer within the meaning of the first sale doctrine and all FRAND pool patents exhaust at that point. The only way to enforce the FRAND royalty payments after this would be in the copyright licence of the code and obviously such a copyright licence, while legal, would not be remotely an Open Source licence.

Exhausting Patents By Distribution

The next question to address is could patents become exhausted simply because the holder distributed Open Source code in any form? As I said before, there is actually a case on point for this as well: Cascades v Samsung. In this case, Cascades tried to sue Samsung for violating a patent on the Dalvik JIT engine in AOSP. Cascades claimed they had licensed the patent to Google for a payment only for use in Google products. Samsung claimed exhaustion because Cascades had licensed the patent to Google and Samsung downloaded AOSP from Google. The court agreed with this and dismissed the infringement action. Case closed, right? Not so fast: it turns out Cascades raised a rather silly defence to Samsung’s claim of exhaustion, namely that the authorized transfer under the exhaustion doctrine didn’t happen until Samsung did the download from Google, so they were still entitled to enforce the Google products only restriction. As I said in the beginning courts have centuries of history with manufacturers trying to get around the exhaustion doctrine and this one crashed and burned just like all the others. However, the question remains: if Cascades had raised a better defence to the exhaustion claim, would they have prevailed?

The defence Cascades could have raised is that Samsung didn’t just download code from Google, they also copied the code they downloaded and those copies should be covered under the patent right to exclude manufacture, which didn’t exhaust with the download. To illustrate this in the Alice, Bob, Charlie chain: Alice sells an item to Bob and thus exhausts the patent so Bob can sell it on to Charlie unencumbered. However that exhaustion does not give either Bob or Charlie the right to manufacture a new copy of the item and sell it to Denise because exhaustion only applies to the same item Alice sold, not to a newly manufactured copy of that item.

The copy as new manufacture defence still seems rather vulnerable on two grounds: first because Samsung could download any number of exhausted copies from Google, so what’s the difference between them downloading ten copies and them downloading one copy and then copying it themselves nine times. Secondly, and more importantly, Cascades already had a remedy in copyright law: their patent licence to Google could have required that the AOSP copyright licence be amended not to allow copying of the source code by non-Google entities except on payment of royalties to Cascades. The fact that Cascades did not avail themselves of this remedy at the time means they’re barred from reclaiming it now via patent action.

The bottom line is that distribution exhausts all patents reading on the code you distribute is a very reasonable defence to maintain in a patent infringement lawsuit and it’s one we should be using much more often.

Exhaustion by Contribution

This is much more controversial and currently has no supporting case law. The idea is that Distribution can occur even with only incremental updates on the existing base (git pull to update code, say), so if delta updates constitute an authorized transfer under the exhaustion doctrine, then so must a patch based contribution, being a delta update from a contributor to the project, be an authorized transfer. In which case all patents which read on the project at the time of contribution must also exhaust when the contribution is made.

Even if the above doesn’t fly, it’s undeniable most contributions today are made by cloning a git tree and republishing it plus your own updates (essentially a github fork) which makes you a bona fide distributor of the whole project because it can all be downloaded from your cloned tree. Thus I think it’s reasonable to hold that all patents owned by distributors and contributors in an open source project have exhausted in that project. In other words all the arguments about the scope and extent of patent grants and patent capture in open source licences is entirely unnecessary.

Therefore, all active participants in an Open Source community ipso facto exhaust any patents on the community code as that code is redistributed.

Implications for Proprietary Software

Firstly, it’s important to note that the exhaustion arguments above have no impact on the patentability of software or the validity of software patents in general, just on their enforcement. Secondly, exhaustion is triggered by the unencumbered right to redistribute which is present in all Open Source licences. However, proprietary software doesn’t come with a right to redistribute in the copyright licence, meaning exhaustion likely doesn’t trigger for them. Thus the exhaustion arguments above have no real impact on the ability to enforce software patents in proprietary code except that one possible defence that could be raised is that the code practising the patent in the proprietary software was, in fact, legitimately obtained from an open source project under a permissive licence and thus the patent has exhausted. The solution, obviously, is that if you worry about enforceability of patents in proprietary software, always use a copyleft licence for your open source.

What about the Patent Troll Problem?

Trolls, by their nature, are not IP producing entities, thus they are not ecosystem participants. Therefore trolls, being outside the community, can pursue infringement cases unburdened by exhaustion problems. In theory, this is partially true but Trolls don’t produce anything, therefore they have to acquire their patents from someone who does. That means that if the producer from whom the troll acquired the patent was active in the community, the patent has still likely exhausted. Since the life of a patent is roughly 20 years and mass adoption of open source throughout the software industry is only really 10 years old1 there still may exist patents owned by Trolls that came from corporations before they began to be Open Source players and thus might not be exhausted.

The hope this offers for the Troll problem is that in 10 years time, all these unexhausted patents will have expired and thanks to the onward and upward adoption of open source there really will be no place for Trolls to acquire unexhausted patents to use against the software industry, so the Troll threat is time limited.

A Call to Arms: Realising the Elimination of Patents in Open Source

Your mission, should you choose to be part of this project, is to help advance the legal doctrines on patent exhaustion. In particular, if the company you work for is sued for patent infringement in any Open Source project, even by a troll, suggest they look into asserting an exhaustion based defence. Even if your company isn’t currently under threat of litigation, simply raising awareness of the option of exhaustion can help enormously.

The first case an exhaustion defence could potentially be tried is this one: Sequoia Technology is asserting a patent against LVM in the Linux kernel. However it turns out that patent 6,718,436 is actually assigned to ETRI, who merely licensed it to Sequoia for the purposes of litigation. ETRI, by the way, is a Linux Foundation member but, more importantly, in 2007 ETRI launched their own distribution of Linux called Booyo which would appear to be evidence that their own actions as a distributor of the Linux Kernel have exhaused patent 6,718,436 in Linux long before they ever licensed it to Sequoia.

If we get this right, in 10 years the Patent threat in Open Source could be history, which would be a nice little legacy to leave our children.

Webauthn in Linux with a TPM via the HID gadget

Account security on the modern web is a bit of a nightmare. Everyone understands the need for strong passwords which are different for each account, but managing them is problematic because the human mind just can’t remember hundreds of complete gibberish words so everyone uses a password manager (which, lets admit it, for a lot of people is to write it down). A solution to this problem has long been something called two factor authentication (2FA) which authenticates you by something you know (like a password) and something you posses (like a TPM or a USB token). The problem has always been that you ideally need a different 2FA for each website, so that a compromise of one website doesn’t lead to the compromise of all your accounts.

Enter webauthn. This is designed as a 2FA protocol that uses public key cryptography instead of shared secrets and also uses a different public/private key pair for each website. Thus aspiring to be a passwordless secure scalable 2FA system for the web. However, the webauthn standard only specifies how the protocol works when the browser communicates with the remote website, there’s a different standard called FIDO or U2F that specifies how the browser communicates with the second factor (called an authenticator in FIDO speak) and how that second factor works.

It turns out that the FIDO standards do specify a TPM as one possible backend, so what, you might ask does this have to do with the Linux Gadget subsystem? The answer, it turns out, is that although the standards do recommend a TPM as the second factor, they don’t specify how to connect to one. The only connection protocols in the Client To Authenticator Protocol (CTAP) specifications are USB, BLE and NFC. And, in fact, the only one that’s really widely implemented in browsers is USB, so if you want to connect your laptop’s TPM to a browser it’s going to have to go over USB meaning you need a Linux USB gadget. Conspiracy theorists will obviously notice that if the main current connector is USB and FIDO requires new USB tokens because it’s a new standard then webauthn is a boon to token manufacturers.

How does Webauthn Work?

The protocol comes in two flavours, version 1 and version 2. Version 1 is fixed cryptography and version 2 is agile cryptography. However, version1 is simpler so that’s the one I’ll explain.

Webauthn essentially consists of two phases: a registration phase where the authenticator is tied to the account, which often happens when the remote account is created, and authentication where the authenticator is used to log in again to the website after registration. Obviously accounts often outlive the second factor, especially if it’s tied to a machine like the TPM, so the standard contemplates a single account having multiple registered authenticators.

The registration request consists of a random challenge supplied by the remote website to prevent replay and an application id which is constructed by the browser from the website supplied ID and the web origin of the site. The design is that the application ID should be unique for each remote account and not subject to being faked by the remote site to trick you into giving up some other application’s credentials.

The authenticator’s response consists of a unique public key, an opaque key handle, an attestation X.509 certificate containing a public key and a signature over the challenge, the application ID, the public key and the key handle using the private key of the certificate. The remote website can verify this signature against the certificate to verify registration. Additionally, Google recommends that the website also verifies the attestation certificate against a list of know device master certificates to prove it is talking to a genuine U2F authenticator. Since no-one is currently maintaining a database of “genuine” second factor master certificates, this last step mostly isn’t done today.

In version 1, the only key scheme allowed is Elliptic Curve over the NIST P-256 curve. This means that the public key is always 65 bytes long and an encrypted (or wrapped) form of the private key can be stashed inside the opaque key handle, which may be a maximum of 255 bytes. Since the key handle must be presented for each authentication attempt, it relieves the second factor from having to remember an ever increasing list of public/private key pairs because all it needs to do is unwrap the private key from the opaque handle and perform the signature and then forget the unwrapped private key. Note that this means per user account authenticator, the remote website must store the public key and the key handle, meaning about 300 bytes extra, but that’s peanuts compared to the amount of information remote websites usually store per registered account.

To perform an authentication the remote website presents a unique challenge, the raw ID from which the browser should construct the same application ID and the key handle. Ideally the authenticator should verify that the application ID matches the one used for registration (so it should be part of the wrapped key handle) and then perform a signature over the application ID, the challenge and a unique monotonically increasing counter number which is sent back in the response. To validly authenticate, the remote website verifies the signature is genuine and that the count has increased from the last time authentication has done (so it has to store the per authenticator 4 byte count as well). Any increase is fine, so each second factor only needs to maintain a single monotonically increasing counter to use for every registered site.

Problems with Webauthn and the TPM

The primary problem is the attestation certificate, which is actually an issue for the whole protocol. TPMs are actually designed to do attestation correctly, which means providing proof of being a genuine TPM without compromising the user’s privacy. The way they do this is via a somewhat complex attestation protocol involving a privacy CA. The problem they’re seeking to avoid is that if you present the same certificate every time you use the device for registration you can be tracked via that certificate and your privacy is compromised. The way the TPM gets around this is that you can use a privacy CA to produce an arbitrary number of different certificates for the same TPM and you could present a new one every time, thus leaving nothing to be tracked by.

The ability to track users by certificate has been a criticism levelled at FIDO and the best the alliance can come up with is the idea that perhaps you batch the attestation certificates, so the same certificate is used in hundreds of new keys.

The problem for TPMs though is that until FIDO devices use proper privacy CA based attestation, the best you can do is generate a separate self signed attestation certificate. The reason is that the TPM does contain its own certificate, but it’s encryption only, not signing because of the way the TPM privacy CA based attestation works. Thus, even if you were willing to give up your privacy you can’t use the TPM EK certificate as the FIDO attestation certificate. Plus, if Google carries out its threat to verify attestation certificates, this scheme is no longer going to work.

Aside about Browsers and CTAP

The crypto aware among you will recognise that there is already a library based standard that can be used to talk to a variety of USB tokens and even the TPM called PKCS#11. Mozilla Firefox, for instance, already supports using this as I demonstrated in a previous blog post. One might think, based on what I said about the one token per key problem in the introduction, that PKCS#11 can’t support the new key wrapping based aspect of FIDO but, in fact, it can via the C_WrapKey/C_UnwrapKey API. The only thing PKCS#11 can’t do is the new monotonic counter.

Even if PKCS#11 can’t perform all the necessary functions, what about a new or extended library based protocol? This is a good question to which I’ve been unable to get a satisfactory answer. Certainly doing CTAP correctly requires that your browser be able to speak directly to the USB, Bluetooth and NFC subsystems. Perhaps not too hard for a single platform browser like Internet Explorer, but fraught with platform complexity for generic browsers like FireFox where the only solution is to have a rust based accessor for every supported platform.

Certainly the lack of a library interface are where the TPM issues come from, because without that we have to plug the TPM based FIDO layer into a browser over an existing CTAP protocol it supports, i.e. USB. Fortunately Linux has the USB Gadget subsystem which fits the bill precisely.

Building Synthetic HID Devices with USB Gadget

Before you try this at home, I should point out that the Linux HID Gadget has a longstanding bug that will cause your machine to hang unless you have this patch applied. You have been warned!

The HID subsystem is for driving Human Interaction Devices meaning keyboard and mice. However, it has a simple packet (called report in USB speak) based protocol which is easy for most things to use. In order to facilitate this, Linux actually provides hidraw devices which allow you to send and receive these reports using read and write system calls (which, in fact, is how Firefox on Linux speaks CTAP). What the hid gadget does when set up is provide all the static emulation of HID device protocols (like discovery pages) while allowing you to send and receive the hidraw packets over the /dev/hidgX device tap, also via read and write (essentially operating like a tty/pty pair1). To get the whole thing running, the final piece of the puzzle is that the browser (most likely running as you) needs to be able to speak to the hidraw device, so you need a udev rule to make it accessible because by default they’re 0600. Since the same goes for every other USB security token, you’ll find the template in the same rpm that installs the PKCS#11 library for the token.

The way CTAP works is that every transaction is split into 64 byte reports and sent over the hidraw interface. All you need to do to get this setup is initialise a report descriptor for this type of device. Since it’s somewhat cumbersome to do, I’ve created this script to do it (run it as root). Once you have this, the hidraw and hidg devices will appear (make them both user accessible with chmod 666) and then all you need is a programme to drive the hidg device and you’re done.

A TPM Based Hid Gadget Driver

Note: this section is written describing TPM 2.0.

The first thing we need out of the TPM is a monotonic counter, but all TPMs have NV counter indexes which can be created (all TPM counters are 8 byte, whereas the CTAP protocol requires 4 bytes, but we simply chop off the top 4 bytes). By convention I create the counter at NV index 01000101. Once created, this counter will be persistent and monotonic for the lifetime of the TPM.

The next thing you need is an attestation certificate and key. These must be NIST P-256 based, but it’s easy to get openssl to create them

openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:prime256v1 -pkeyopt ec_param_enc:named_curve -out reg_key.key

openssl req -new -x509 -subj '/CN=My Fido Token/' -key reg_key.key -out reg_key.der -outform DER

This creates a self signed certificate, but you could also create a certificate chain this way.

Finally, we need the TPM to generate one NIST P-256 key pair per registration. Here we use the TPM2_Create() call which gets the TPM to create a random asymmetric key pair and return the public and wrapped private pieces. We can simply bundle these up and return them as the key handle (fortunately, what the TPM spits back for a NIST P-256 key is about 190 bytes when properly marshalled). When the remote end requests an authentication, we extract the TPM key from the key handle and use a TPM2_Load to place it in the TPM and sign the hash and then unload it from the TPM. Putting this all together this project (which is highly experimental) provides the script to create the devices and a hidg driver that interfaces to the TPM. All you need to do is run it as

hidgd /dev/hidg0 reg_key.der reg_key.key

And you’re good to go. If you want to test it there are plenty of public domain webauthn test sites, webauthn.org and webauthn.io2 are two I’ve tested as working.

TODO Items

The webauthn standard specifies the USB authenticator should ask for permission before performing either registration or authentication. Currently the TPM hid gadget doesn’t have any external verification, but in future I’ll add a configurable pinentry to add confirmation and possibly also a single password for verification.

The current code also does nothing to verify the application ID on a per authorization basis. This is a security problem because you are currently vulnerable to being spoofed by malicious websites who could hand you a snooped key handle and then use the signature to fake your login to a different site. To avoid this, I’m planning to use the policy area of the TPM key to hold the application ID. This should work because the generated keys have no authorization, either policy or password, so the policy area is effectively redundant. It is in the unwrapped public key, but if any part of the public key is tampered with the TPM will detect this via a hash in the wrapped private error and give a binding error on load.

The current code really only does version 1 of the FIDO protocol. Ideally it needs upgrading to version 2. However, there’s not really much point because for all the crypto agility, most TPMs on the market today can only do NIST P-256 curves, so you wouldn’t gain that much.

Conclusions

Using this scheme you’re ready to play with FIDO/U2F as long as you have a laptop with a functional TPM 2.0 and a working USB gadget subsystem. If you want to play, please remember to make sure you have the gadget patch applied.

Using TPM Based Client Certificates on Firefox and Apache

One of the useful features of Apache (or indeed any competent web server) is the ability to use client side certificates. All this means is that a certificate from each end of the TLS transaction is verified: the browser verifies the website certificate, but the website requires the client also to present one and verifies it. Using client certificates, when linked to your own client certificate CA gives web transactions the strength of two factor authentication if you do it on the login page. I use this feature quite a lot for all the admin features my own website does. With apache it’s really simple to turn on with the

SSLCACertificateFile

Directive which allows you to specify the CA for the accepted certificates. In my own setup I have my own self signed certificate as CA and then all the authority certificates use it as the issuer. You can turn Client Certificate verification on per location basis simply by doing

<Location /some/web/location>
SSLVerifyClient require
</Location

And Apache will take care of requesting the client certificate and verifying it against the CA. The only caveat here is that TLSv1.3 currently fails to work for this, so you have to disable it with

SSLProtocol -TLSv1.3

Client Certificates in Firefox

Firefox is somewhat hard to handle for SSL because it includes its own hand written mozilla secure sockets code, which has a toolkit quite unlike any other ssl toolkit1. In order to import a client certificate and key into firefox, you need to create a pkcs12 file containing them and import that into the “Your Certificates” box which is under Preferences > Privacy & Security > View Certificates

Obviously, simply supplying a key file to firefox presents security issues because you’d like to prevent a clever hacker from gaining access to it and thus running off with your client certificate. Firefox achieves a modicum of security by doing all key operations over the PKCS#11 API via a software token, which should mean that even malicious javascript cannot gain access to your key but merely the signing API

However, assuming you don’t quite trust this software separation, you need to store your client signing key in a secure vault like a TPM to make sure no web hacker can gain access to it. Various crypto system connectors, like the OpenSSL TPM2 and TPM2 engine, already exist but because Firefox uses its own crytographic code it can’t take advantage of them. In fact, the only external object the Firefox crypto code can use is a PKCS#11 module.

Aside about TPM2 and PKCS#11

The design of PKCS#11 is that it is a loadable library which can find and enumerate keys and certificates in some type of hardware device like a USB Key or a PCI attached HSM. However, since the connector is simply a library, nothing requires it connect to something physical and the OpenDNSSEC project actually produces a purely software based cryptographic token. In theory, then, it should be easy

The problems come with the PKCS#11 expectation of key residency: The library allows the consuming program to enumerate a list of slots each of which may, or may not, be occupied by a single token. Each token may contain one or more keys and certificates. Now the TPM does have a concept of a key resident in NV memory, which is directly analagous to the PKCS#11 concept of a token based key. The problems start with the TPM2 PC Client Profile which recommends this NV area be about 512 bytes, which is big enough for all of one key and thus not very scalable. In fact, the imagined use case of the TPM is with volatile keys which are demand loaded.

Demand loaded keys map very nicely to the OpenSSL idea of a key file, which is why OpenSSL TPM engines are very easy to understand and use, but they don’t map at all into the concept of token resident keys. The closest interface PKCS#11 has for handling key files is the provisioning calls, but even there they’re designed for placing keys inside tokens and, once provisioned, the keys are expected to be non-volatile. Worse still, very few PKCS#11 module consumers actually do provisioning, they mostly leave it up to a separate binary they expect the token producer to supply.

Even if the demand loading problem could be solved, the PKCS#11 API requires quite a bit of additional information about keys, like ids, serial numbers and labels that aren’t present in the standard OpenSSL key files and have to be supplied somehow.

Solving the Key File to PKCS#11 Mismatch

The solution seems reasonably simple: build a standard PKCS#11 library that is driven by a known configuration file. This configuration file can map keys to slots, as required by PKCS#11, and also supply all the missing information. the C_Login() operation is expected to supply a passphrase (or PIN in PKCS#11 speak) so that would be the point at which the private key could be loaded.

One of the interesting features of the above is that, while it could be implemented for the TPM engine only, it can also be implemented as a generic OpenSSL key exporter to PKCS#11 that happens also to take engine keys. That would mean it would work for non-engine keys as well as any engine that exists for OpenSSL … a nice little win.

Building an OpenSSL PKCS#11 Key Exporter

A Token can be built from a very simple ini like configuration file, with the global section setting global properties, like manufacurer id and library description and each individual section being used to instantiate a slot containing one key. We can make the slot name, the id and the label the same if not overridden and use key file directives to load the public and private keys. The serial number seems best constructed from a hash of the public key parameters (again, if not overridden). In order to support engine keys, the token library needs to know which engine to invoke, so I added an engine keyword to tell it.

With that, the mechanics of making the token library work with any OpenSSL key are set, the only thing is to plumb in the PKCS#11 glue API. At this point, I should add that the goal is simply to get keys and tokens working, not to replicate a full featured PKCS#11 API, so you shouldn’t use this as something to test against for a reference implementation (the softhsm2 token is much better for that). However, it should be functional enough to use for storing keys in Firefox (as well as other things, see below).

The current reasonably full featured source code is here, with a reference build using the OpenSUSE Build Service here. I should add that some of the build failures are due to problems with p11-kit and others due to the way Debian gets the wrong engine path for libp11.

At Last: Getting TPM Keys working with Firefox

A final problem with Firefox is that there seems to be no way to import a certificate file for which the private key is located on a token. The only way Firefox seems to support this is if the token contains both the private key and the certificate. At least this is my own project, so some coding later, the token now supports certificates as well.

The next problem is more mundane: generating the certificate and key. Obviously, the safest key is one which has never left the TPM, which means the certificate request needs to be built from it. I chose a CSR type that also includes my name and my machine name for later easy discrimination (and revocation if I ever lose my laptop). This is the sequence of commands for my machine called jarvis.

create_tpm2_key -a key.tpm
openssl req -subj "/CN=James Bottomley/UID=jarivs/" -new -engine tpm2 -keyform engine -key key.tpm -nodes -out jarvis.csr
openssl x509 -in jarvis.csr -req -CA my-ca.crt -engine tpm2 -CAkeyform engine -CAkey my-ca.key -days 3650 -out jarvis.crt

As you can see from the above, the key is first created by the TPM, then that key is used to create a certificate request where the common name is my name and the UID is the machine name (this is just my convention, feel free to use your own) and then finally it’s signed by my own CA, which you’ll notice is also based on a TPM key. Once I have this, I’m free to create an ini file to export it as a token to Firefox

manufacturer id = Firefox Client Cert
library description = Cert for hansen partnership
[mozilla-key]
certificate = /home/jejb/jarvis.crt
private key = /home/jejb/key.tpm
engine = tpm2

All I now need to do is load the PKCS#11 shared object library into Firefox using Settings > Privacy & Security > Security Devices > Load and I have a TPM based client certificate ready for use.

Additional Uses

It turns out once you have a generic PKCS#11 exporter for engine keys, there’s no end of uses for them. One of the most convenient has been using TPM2 keys with gnutls. Although gnutls was quick to adopt TPM 1.2 based keys, it’s been much slower with TPM2 but because gnutls already has a PKCS#11 interface using the p11 kit URI format, you can easily build a config file of all the TPM2 keys you want it to use and simply use them by URI in gnutls.

Unfortunately, this has also lead to some problems, the biggest one being Firefox: Firefox assumes, once you load a PKCS#11 module library, that you want it to use every single key it can find, which is fine until it pops up 10 dialogue boxes each time you start it, one for each key password, particularly if there’s only one key you actually care about it using. This problem doesn’t seem solvable in the Firefox token interface, so the eventual way I did it was to add the ability to specify the config file in the environment (variable OPENSSL_PKCS11_CONF) and modify my xfce Firefox action to set this in the environment pointing at a special configuration file with only Firefox’s key in it.

Conclusions and Future Work

Hopefully I’ve demonstrated this simple PKCS#11 converter can be useful both to keeping Firefox keys safe as well as uses in other things like gnutls. Unfortunately, it turns out that the world wide web is turning against PKCS#11 tokens as having usability problems and is moving on to something called FIDO2 tokens which have the web browser talking directly to the USB token. In my next technical post I hope to explain how you can use the Linux Kernel USB gadget system to connect a TPM up easily as a FIDO2 token so you can use the new passwordless webauthn protocol seamlessly.

Why Microsoft is a good steward for GitHub

There seems to be a lot of hysteria going on in various communities that depend on GitHub for their project hosting around the Microsoft acquisition (just look in the comments here and here).  Obviously a lot of social media ink will be expended on this, so I’d just like to explain why as a committed open source developer, I think this will actually be a good thing.

Firstly, it’s very important to remember that git may be open source, but GitHub isn’t: none of the scripts that run the service have much published source code at all.  It may be a closed source hosting infrastructure that a lot of open source projects rely on but that doesn’t make it open source itself.  So why is GitHub not open source?  Well, it all goes back to the business model.  Notwithstanding fantastic market valuations there are lots of companies that play in the open source ecosystem, like GitHub, which struggle to find a sustainable business model (or even revenue).  This leads to a lot of open closed/open type models like GitHub (the reason GitHub keeps the code closed is so they can sell it to other companies for internal source management) or Docker Enterprise.

Secondly, even if GitHub were fully open source, as I’ve argued in my essays about the GPL, to trust a corporate player in the ecosystem, you need to be able to understand fully its business motivation for being there and verify the business goals align with the community ones.  As long as the business motivation is transparent and aligned with the community, you know you can trust it.  However, most of the new supposedly “open source” companies don’t have clear business models at all, which means their business motivation is anything but transparent.  Paradoxically this means that most of the new corporate idols in the open source ecosystem are remarkably untrustworthy because their business model changes from week to week as they struggle to please their venture capitalist overlords.  There’s no way you can get the transparency necessary for open source trust if the company itself doesn’t know what its business model will be next week.

Finally, this means that companies with well established open source business models and motivations that don’t depend on the whims of VCs are much more trustworthy in open source in the long term.  Although it’s a fairly recent convert, Microsoft is now among these because it’s clearly visible how its conversion from desktop to cloud both requires open source and requires Microsoft to play nicely with open source.  The fact that it has a trust deficit from past actions is a bonus because from the corporate point of view it has to be extra vigilant in maintaining its open source credentials.  The clinching factor is that GitHub is now ancillary to Microsoft’s open source strategy, not its sole means of revenue, so lots of previous less community oriented decisions, like keeping the GitHub code closed source, can be revisited in time as Microsoft seeks to gain community trust.

For the record, I should point out that although I have a github account, I host all my code on kernel.org mostly because the GitHub workflow really annoys me, having spent a lot of time trying to deduce commit motivations in a sparse git commit messages which then require delving into github issues and pull requests only to work out that most of the necessary details are in some private slack back channel well away from public view.  Regardless of who owns GitHub, I don’t see this workflow problem changing any time soon, so I’ll be sticking to my current hosting setup.