Converting Engines to OpenSSL-3 Providers

Engines in OpenSSL have a long history of providing new algorithms (Russian GOST hash/signature etc) but they can also be used to interface external crypto tokens (PKCS#11) or even key managers like my own TPM engine. I’ve actually been using my TPM2 engine for nearly a decade so that I no longer have to keep unprotected private keys anywhere on my laptops (including for ssh). The purpose of this post is to look at the differences between Providers and Engines and give advice on the minimum necessary Provider implementation to give back all the Engine functionality. So this post is aimed at Engine developers who wish to convert to Providers rather than giving user advice for either.

TPMs and Engines

TPM2 actually has a remarkable number of algorithms: hashing, symmetric encryption, asymmetric signatures, key derivation, etc. However, most TPMs are connected to the host over very slow busses (usually serial), which means that no-one in their right mind would use a TPM for bulk data operations (like hashing or symmetric encryption) since it will take orders of magnitude longer than if the native CPU did it. Thus from an Engine point of view, the TPM is really only good for guarding private asymmetric keys and doing sign or decrypt operations on them, which are the only capabilities the TPM engine has.

Hashes and Signatures

Although I said above we don’t use the TPM for doing hashes, the TPM2_Sign() routines insist on knowing which hash they’re signing. For ECDSA signatures, this is irrelevant, since the hash type plays no part in the signature (it’s always truncated to key length and converted to a bignum) but for RSA the ASN.1 form of the hash description is part of the toBeSigned data. The problem now is that early TPM2s only had two hash algorithms (sha1 and sha256) and the engine wanted to be able to use larger hash sizes. The solution was actually easy: lie about the hash size for ECDSA, so always give the hash that’s the width of the key (sha256 for NIST P-256 and sha384 for NIST P-384) and left truncate the passed in hash if it is larger or left zero pad it if it is smaller.
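The truncate-or-pad step can be sketched in a few lines of C. This is a minimal illustration rather than the engine’s actual code: fit_digest() is a hypothetical helper name, and it assumes that “left truncate” means keeping the leftmost bytes of the digest, which is how ECDSA itself shortens an over-long hash.

```c
#include <stddef.h>
#include <string.h>

/*
 * Fit a digest to the width the key expects (32 bytes for NIST P-256,
 * 48 bytes for NIST P-384). A longer digest keeps only its leftmost
 * bytes; a shorter one gets zero bytes added on the left.
 * 'out' must have room for keylen bytes.
 */
static void fit_digest(unsigned char *out, size_t keylen,
		       const unsigned char *digest, size_t dlen)
{
	if (dlen >= keylen) {
		/* too long (or exact): keep the leftmost keylen bytes */
		memcpy(out, digest, keylen);
	} else {
		/* too short: zero fill on the left, digest on the right */
		memset(out, 0, keylen - dlen);
		memcpy(out + (keylen - dlen), digest, dlen);
	}
}
```

With this in place a sha512 digest presented to a P-256 key would be cut to its first 32 bytes, while a sha1 digest would gain 12 leading zero bytes.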

For RSA, the problem is more acute, since TPM2_Sign() actually takes a raw digest and adds the hash description itself, but the engine code sends down the fully described hash, which merely needs to be padded if PKCS1 (PSS data is fully padded when sent down) and encrypted with the private key. The solution to this, taken years ago, was not to bother with TPM2_Sign() at all for RSA keys but instead to do a Decrypt operation. This also means that TPM RSA engine keys are marked as decryption keys, not signing keys.

The Engine Itself

Given that the TPM is really only guarding the private keys, it only makes sense to substitute engine functions for the private key operations. Although the TPM can do public key operations, the core OpenSSL routines do them much faster and no information is leaked about the private key by doing them through OpenSSL, so Engine keys were constructed from standard OpenSSL keys by substituting a couple of private key methods from the underlying key types. One thing Engines were really bad at was passing additional parameters at key creation time and doing key wrapping. The result is that most Engines already have a separate tool to create engine keys (create_tpm2_key for the TPM2 engine) because complex arguments are needed for TPM specific things like key policy.

TPM keys are really both public and private keys combined and the public part of the key can be accessed without a password (unlike OpenSSL keys) or even access to the TPM that created the key. However, the engine code doesn’t usually know when only the public part of the key will be required and password prompting is done in OpenSSL at key loading (the TPM doesn’t need a password until key use), so usually after a TPM key is created, the public key is also separately derived using a pkey operation and used as a normal public key.

The final, and most problematic, Engine feature is key loading. Engine keys must be loaded using a special API (ENGINE_load_private_key). The OpenSSL built-in applications require you to specify the key type (-keyform option), while most well written OpenSSL applications simply try loading the PEM key first, then the DER key, then the Engine key (since they all have different APIs). Frequently, however, the Engine key is forgotten, so the application has to be patched before it can be used with any engine.

Converting Engines to Providers

The provider API has several pieces which apply to asymmetric key handling: Store, Encode/Decode, Key Management, Signing and Decryption (plus many more if you provide hashes or symmetric algorithms). One thing to remember about the store API is that if you only have file based keys, you should use the generic file store instead. Implementing your own store is only necessary if you also have a URI based input (like PKCS#11). In fact the TPM Engine has a URI for persistent keys, so the TPM store implementation will be dealt with later.

Provider Basics

If a provider is specified on the OpenSSL command line, it will become the sole provider of every algorithm. That means that providers like the TPM2 one, which only fill in a subset of functions, cannot operate on their own and must always be used with another provider (usually the default one). After initialization (see below) all provider actions are governed by algorithm tables. One of the key questions for any provider is what to do about algorithm names and properties. Because the TPM2 provider relies on external providers for other algorithms, it must use consistent key names (so “EC” for Elliptic curve and “RSA” for RSA), even though it has only a single key type. There are also elements of provider key management, like the way Elliptic Curve keys change name to “ECDSA” for signing and “ECDH” for derivation, which are driven by the key management query operation function. As far as I can tell, this provides no benefit and merely serves to add complexity to the provider, so my provider doesn’t implement these functions and uses the same key names throughout.

The most mysterious string of all is the algorithm property one. The manual gives very little clue as to what should be in it besides “provider=<provider name>”. Empirically it seems to have input, output and structure elements, which are primarily used by encoders and decoders: input can be either der or pem and structure must be the same as the OSSL_OBJECT_PARAM_DATA_STRUCTURE string produced by the der decoder (although you are free to choose any name for this). output is even more varied and the best current list is provided by the source; however the only encoder the TPM2 provider actually provides is the text one.

One of the really nice things about providers is that when OpenSSL is presented with a key to load, every provider will be tried (usually in the order they’re specified on the command line) to decode and load the key. This completely fixes the problem with missing ENGINE_load_private_key() functions in applications because now all applications can use any provider key. This benefit alone is enough to outweigh all the problems of doing the actual conversion to a provider.

Replacing Engine Controls

Engine controls were key/value pairs passed into engines. The TPM2 engine has two: “PIN” for the parent authority and “NVPREFIX” for the prefix which identifies a non-volatile key. Although these can be passed in with the ENGINE_ctrl() functions, they were mostly set in the configuration file. This latter mechanism can be replaced with the provider base callback core_get_params(). Most engine controls actually set global variables and with the provider, they could be placed into the provider context. However, for code sharing it’s easier simply to keep the current globals mechanism.

Initialization and Contexts

Every provider has to have an OSSL_provider_init() routine which fills in a dispatch table and allocates a core context, which is passed in to every other context routine. For a provider, there’s really only one instance, so storing variables in the provider context is really no different (except error handling and actually getting destructors) from using static variables and since the engine used static variables, that’s what we’ll stick with. However, pretty much every routine will need an allocated library context, so it’s easiest to allocate at provider init time and pass it through as the provider context. The dispatch table must contain a query_operation function, and probably needs a teardown function if you need to use a destructor, but nothing else.

All provider function groups require a newctx() and freectx() call. This is not optional because the current OpenSSL code calls them without checking so they cannot be NULL. Thus for function groups (like encoders and key management) where new contexts aren’t really required it makes sense to use pass through context functions that simply pass through the provider context for newctx() and do nothing for freectx().
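A pass-through pair can be as small as the sketch below. The function names match the ones used in the dispatch tables later in this post, but the bodies are an assumption about what such functions would contain, written for the single-argument newctx() form that decoders and encoders use.

```c
#include <stddef.h>

/* newctx: no private state needed, so reuse the provider context. */
static void *tpm2_passthrough_newctx(void *provctx)
{
	return provctx;
}

/* freectx: the provider context is owned elsewhere, so do nothing. */
static void tpm2_passthrough_freectx(void *ctx)
{
	(void)ctx;
}
```

Because newctx() hands back the provider context itself, every decode or encode call receives that context as its first argument without any extra allocation.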

The man page implies it is necessary to pick a load of functions from the in argument, but it seems unnecessary for those which the OpenSSL library already provides. I assume it’s something to do with a provider not requiring OpenSSL symbols, but it’s impossible to implement a provider today without relying on other OpenSSL functions than those which can be picked out of the in argument.

Decoders

Decoders are used to convert a read file from PEM to DER (this is essentially the same conversion for every provider, so it is strange you have to do this rather than it being done in the core routines) and then DER to an internal key structure. The remaining decoders take DER in and output a labelled key structure (which is used as a component of the EVP_PKEY). If you do both RSA and EC keys, you need one decoder for each key type and, unfortunately, both must be provided and may not cross decode (the RSA decoder must reject EC keys and vice versa). This is actually required so the OpenSSL core can tell what type of key it has but is a royal pain for things like the TPM where the key DER is identical regardless of key type:

const OSSL_ALGORITHM decoders[] = {
	{ "DER", "provider=tpm2,input=pem", decode_pem_fns },
	{ "RSA", "provider=tpm2,input=der,structure=TPM2", decode_rsa_fns },
	{ "EC", "provider=tpm2,input=der,structure=TPM2", decode_ec_fns },
	{ NULL, NULL, NULL }
};

The decode_pem_fns can be cut and pasted from any provider with the sole exception that you probably have a different PEM guard string that you need to check for.

Then a sample decoder function set looks like:

static const OSSL_DISPATCH decode_rsa_fns[] = {
	{ OSSL_FUNC_DECODER_NEWCTX, (void (*)(void))tpm2_passthrough_newctx },
	{ OSSL_FUNC_DECODER_FREECTX, (void (*)(void))tpm2_passthrough_freectx },
	{ OSSL_FUNC_DECODER_DECODE, (void (*)(void))tpm2_rsa_decode },
	{ 0, NULL }
};

The main job of the DECODER_DECODE function is to take the DER form of the key and convert it to an internal PKEY and send that PKEY up by reference so it can be consumed by a key management load.

Encoders

By and large, engines all come with creation tools for key files, which means that while you could now use the encoder routines to create key files, it’s probably better to stick with what you have (especially for things like the TPM that can have complex policy statements attached to keys), so you can omit providing any encoder functions at all. The only possible exception is pretty printing: if you want the keys pretty printed, you might consider a text output encoder:

const OSSL_ALGORITHM encoders[] = {
	{ "RSA", "provider=tpm2,output=text", encode_text_fns },
	{ "EC", "provider=tpm2,output=text", encode_text_fns },
	{ NULL, NULL, NULL }
};

Which largely follows the format for decoders:

static const OSSL_DISPATCH encode_text_fns[] = {
	{ OSSL_FUNC_ENCODER_NEWCTX, (void (*)(void))tpm2_passthrough_newctx },
	{ OSSL_FUNC_ENCODER_FREECTX, (void (*)(void))tpm2_passthrough_freectx },
	{ OSSL_FUNC_ENCODER_ENCODE, (void (*)(void))tpm2_encode_text },
	{ 0, NULL }
};

Note: there are many more encode/decode function types you could supply, but the above are the essential ones.

Key Management

Nothing in the key management functions requires the underlying key object to be reference counted since it belongs to an already reference counted EVP_PKEY structure in the OpenSSL generic routines. However, the signature operations can’t be implemented without context duplication and the signature context must contain a reference to the provider key so, depending on how the engine implements keys, duplicating via reference might be easier than duplicating via copy. The minimum functionality to implement is LOAD, FREE and HAS. If you are doing Elliptic Curve derive or reference counting your engine keys, you will also need NEW. You also have to provide both GET_PARAMS and GETTABLE_PARAMS (many key management functions have to be implemented in pairs like this) for at least the BITS, SECURITY_BITS and SIZE properties.

You must also implement EXPORT (and EXPORT_TYPES, which must be provided but has no callers) so that you can convert your engine key to an external public key. Note that the EXPORT function must fail if asked to export the private key, otherwise the default provider will try to do the private key operations via the exported key as well.

If you need to do Elliptic Curve key derivation you must also implement IMPORT (and IMPORT_TYPES) because the creation of the peer key (even though it’s a public one) will necessarily go through your provider key management functions.

The HAS function can be problematic because OpenSSL doesn’t assume the interchangeability of public and private keys, even if it is true of the engine. Thus the engine must remember in the decode routines what key selector was used (public, private or both) and make sure to condition HAS on that value.
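The two selection rules in this section (EXPORT refusing the private part, and HAS answering according to what the decoder was asked for) can be sketched as below. struct tpm2_key and tpm2_export_allowed() are hypothetical names for illustration, and the sketch assumes the decode routine stored a non-zero selection. The two selection constants normally come from <openssl/core_dispatch.h>; they are repeated under an #ifndef guard only so the fragment compiles on its own.

```c
#include <stddef.h>

/* Normally provided by <openssl/core_dispatch.h>. */
#ifndef OSSL_KEYMGMT_SELECT_PRIVATE_KEY
# define OSSL_KEYMGMT_SELECT_PRIVATE_KEY 0x01
# define OSSL_KEYMGMT_SELECT_PUBLIC_KEY  0x02
#endif

/* Hypothetical provider key: remembers the selector seen at decode. */
struct tpm2_key {
	int decoded_selection;
};

/* HAS: claim only the key parts the decoder was asked to load. */
static int tpm2_keymgmt_has(const void *keydata, int selection)
{
	const struct tpm2_key *key = keydata;
	int wanted = selection & (OSSL_KEYMGMT_SELECT_PRIVATE_KEY |
				  OSSL_KEYMGMT_SELECT_PUBLIC_KEY);

	if (key == NULL)
		return 0;
	/* succeed only if every wanted part was part of the decode */
	return (wanted & ~key->decoded_selection) == 0;
}

/* EXPORT gate: refuse any request that includes the private part. */
static int tpm2_export_allowed(int selection)
{
	return (selection & OSSL_KEYMGMT_SELECT_PRIVATE_KEY) == 0;
}
```

A real EXPORT function would call a check like tpm2_export_allowed() first and return 0 immediately when it fails, before building any parameters for the callback.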

Signatures

This is one of the most confusing areas for simple signing devices (which don’t do hashing) because you’d assume you can implement NEWCTX, FREECTX, SIGN_INIT and SIGN and be done. Unfortunately, in spite of the fact that all the DIGEST_SIGN_… functions can be implemented in terms of the previous functions and generic hashing, they aren’t, so all providers are required to duplicate hashing and signing functions including constructing the binary ASN.1 for the certificate signature function (via GET_CTX_PARAMS and its pair GETTABLE_CTX_PARAMS). Another issue a sign only token will run into is padding: OpenSSL supports a variety of padding schemes (for RSA) but is deprecating their export, so if your token doesn’t do an expected form of padding, you’ll need to implement that in your provider as well. Recalling that the TPM2 provider uses RSA Decryption for signatures means that the TPM2 provider implementation is entirely responsible for padding all signatures. In order to try to come up with a common solution, I added an opensslmissing directory to my provider under the MIT licence that anyone is free to incorporate into their provider if they end up having the same digest and padding problems I did.
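To give a feel for what “responsible for padding” means, here is a minimal sketch of the standard PKCS#1 v1.5 signature padding (block type 1, as laid out in RFC 8017). It is not taken from opensslmissing, and pkcs1_type1_pad() is a hypothetical name. The data passed in is assumed to be the fully described hash, and the padded block is what would then go to the RSA private key operation.

```c
#include <stddef.h>
#include <string.h>

/*
 * PKCS#1 v1.5 signature padding (block type 1):
 *     00 01 FF .. FF 00 || data
 * with at least eight FF bytes. 'out' must hold keylen bytes, where
 * keylen is the RSA modulus size in bytes.
 * Returns 1 on success, 0 if the data does not fit.
 */
static int pkcs1_type1_pad(unsigned char *out, size_t keylen,
			   const unsigned char *data, size_t dlen)
{
	size_t padlen;

	/* 3 framing bytes plus a minimum of 8 FF bytes */
	if (keylen < 11 || dlen > keylen - 11)
		return 0;
	padlen = keylen - dlen - 3;
	out[0] = 0x00;
	out[1] = 0x01;
	memset(out + 2, 0xff, padlen);
	out[2 + padlen] = 0x00;
	memcpy(out + 3 + padlen, data, dlen);
	return 1;
}
```

PSS needs nothing like this from the provider at this point because, as noted earlier, PSS data is already fully padded when it is sent down.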

Decryption and Derivation

The final thing a private key provider needs to do is decryption. This is a very different operation for Elliptic Curve and RSA keys, so you need a different operation type for each (OSSL_OP_ASYM_CIPHER for RSA and OSSL_OP_KEYEXCH for EC). Each ends up being a slightly special snowflake. RSA is special because it may need OAEP padding, which the TPM does, but with the most usual hash being md5, which the TPM doesn’t do (so OAEP padding with arbitrary mask and hash function is also in opensslmissing). EC is special because it requires derivation from another public key. The problem with this latter operation is that because of the way OpenSSL works, the public key must be imported into the provider before it can be used, so you must provide NEW, IMPORT and IMPORT_TYPES routines for key management for this to happen.

Store

The store functions only need to be used if you have to load keys that aren’t file based (for file based keys the default provider file store will load them). For the TPM there is a set of NV Keys with the 0x81 MSO prefix that aren’t file based. We load these in the engine with //nvkey:<hex> as the designator (and the //nvkey: prefix is overridable in the config file). Getting this to work in the Provider is slightly problematic because the scheme (the //nvkey: prefix) must be specified as the provider algorithm_name, which is usually a constant in a static array. This means that the store table actually can’t be static and must have the configuration defined name poked into it before the store is used, but this is relatively easy to arrange in the OSSL_provider_init() function. Once this is done, it’s relatively easy to create a store. The only really problematic function is the STORE_EOF one, which is designed around files but means you have to keep an eof indicator in the context and update it to be 1 once the load function has completed.
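The eof bookkeeping is easy to picture with a small sketch. Everything here is hypothetical: the struct and function names are invented, the real STORE_LOAD and STORE_EOF functions take OpenSSL callback arguments rather than the simplified ones shown, and the URI parsing merely assumes the designator form described above with a configurable prefix.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical store context for a single non-volatile key. */
struct tpm2_store_ctx {
	unsigned long handle;	/* NV key handle parsed from the URI */
	int eof;		/* set to 1 once the key was delivered */
};

/* Parse "<prefix><hex>", e.g. "//nvkey:81000001". Returns 1 if OK. */
static int tpm2_store_parse(struct tpm2_store_ctx *ctx,
			    const char *prefix, const char *uri)
{
	size_t plen = strlen(prefix);
	char *end;

	if (strncmp(uri, prefix, plen) != 0 || uri[plen] == '\0')
		return 0;
	ctx->handle = strtoul(uri + plen, &end, 16);
	if (*end != '\0')
		return 0;	/* trailing characters that are not hex */
	ctx->eof = 0;
	return 1;
}

/* Stand-in for STORE_LOAD: hand out the key once, then flag eof. */
static int tpm2_store_load(struct tpm2_store_ctx *ctx,
			   unsigned long *handle)
{
	if (ctx->eof)
		return 0;
	*handle = ctx->handle;
	ctx->eof = 1;
	return 1;
}

/* Stand-in for STORE_EOF: report the flag kept in the context. */
static int tpm2_store_eof(const struct tpm2_store_ctx *ctx)
{
	return ctx->eof;
}
```

Since a store URI of this kind names exactly one key, a single successful load is all it takes before eof has to read as true.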

The Provider Recursion Problem

This doesn’t seem to be discussed anywhere else, but it can become a huge issue if your provider depends on another library which also uses OpenSSL. The TPM2 provider depends on either the Intel or IBM TSS libraries and both of those use OpenSSL for cryptographic operations around TPM transport security since both of them use ECDH to derive a seed for session encryption and HMAC. The problem is that ordinarily the providers are called in the order they’re listed, so you always have to specify -provider default -provider tpm2 to make up for the missing public key operations in the TPM2 provider. However, the OpenSSL core operates a cache for the provider operations it has previously found and searches the cache first before doing any other lookups, so if the EC key management routines are cached (as they are if you input a TPM format key) and the default ones aren’t (because inputting TPM format keys requires no public key operations), the next attempt to generate an ephemeral EC key for the ECDH security derivation will find the TPM2 provider first. So say you are doing a signature which requires HMAC security to guard against interposer tampering. The use of ECDH in the HMAC seed derivation will then call back into the provider to do an ECDH operation, which also requires session security and will thus call back again into the provider ad infinitum (or at least until stack overflow). The only way to break out of this infinite recursion is to try to prime the cache with the default provider as well as the TPM2 provider, so the TSS library functions can find the default provider first. The (absolutely dirty) hack I have to do this is inside the pkey decode function as

	if (alg == TPM_ALG_ECC) {
		EVP_PKEY_CTX *ctx = EVP_PKEY_CTX_new_id(EVP_PKEY_EC, NULL);
		EVP_PKEY_CTX_free(ctx);
	}

Which currently works to break the recursion loop. However, it is an unreliable hack: it only works because internally the OpenSSL hash bucket implementation orders the method cache by provider address and, since the TPM2 provider is dynamically loaded, it has a higher address than the OpenSSL default one. That ordering will not survive security techniques like Address Space Layout Randomization.

Conclusions

Hopefully I’ve given a rapid (and possibly useful) overview of converting an engine to a provider which will give some pointers about provider conversion to all the engine token implementations out there. Please feel free to repurpose my opensslmissing routines under the MIT licence without any obligations to get them back upstream (although I would be interested in hearing about bugs and feature enhancements). In the end, it was only 1152 lines of C to implement the TPM2 provider (additive on top of the common shared code base with the existing Engine) and 681 lines in opensslmissing, showing firstly that there is still a need for OpenSSL itself to do the missing routines as a provider export and secondly that it really takes a fairly small amount of provider code to wrap an existing engine implementation provided you’re discriminating about what functions you actually provide. As a final remark I should note that the openssl_tpm2_engine has a fairly extensive test suite, all of which now passes with the provider implementation as well.