Tag Archives: lineageos

Adding Two Factor Authentication to Android (LineageOS)

I really like the idea of using biometrics to add extra security, but have always hated the idea that simply touching the fingerprint sensor would unlock your entire phone, so in my version of LineageOS the touch to unlock feature is disabled but I still use second factor biometrics for the security of various apps. Effectively the android unlock policy is Fingerprint OR PIN/Pattern/Password and I simply want that OR to become an AND.

The problem

The idea of using two factor authentication (2FA) was pioneered by GrapheneOS but since I like the smallness of the Pixel 3 that’s not available to me (plus it only seems to work with pin and fingerprint and my preferred unlock is pattern). However, since I build my own LineageOS anyway (so I can sign and secure boot it) I thought I’d look into adding the feature … porting from GrapheneOS should be easy, right? In fact, when looking in the GrapheneOS code for frameworks/base, there are about nine commits adding the feature:

a7a19bf8fb98 add second factor to fingerprint unlock
5dd0e04f82cd add second factor UI
9cc17fd97296 add second factor to FingerprintService
c92a23473f3f add second factor to LockPattern classes
c504b05c933a add second factor to TrustManagerService
0aa7b9ec8408 add second factor to AdaptiveAuthService
62bbdf359687 add second factor to LockSettingsStateListener
7429cc13f971 add second factor to LockSettingsService
6e2d499a37a2 add second factor to DevicePolicyManagerService

And a diffstat of over 3,000 lines … which seems a bit much for changing an OR to an AND. Of course, the reason it’s so huge is because they didn’t change the OR, they implemented an entirely new bouncer (bouncer being the android term in the code for authorisation gateway) that did pin and fingerprint in addition to the other three bouncers doing pattern, pin and password. So not only would I have to port 3,000 lines of code, but if I want a bouncer doing fingerprint and pattern, I’d have to write it. I mean colour me lazy but that seems way too much work for such an apparently simple change.

Creating a new 2FA unlock

So is it actually easy? The rest of this post documents my quest to find out. Android code itself isn’t always easy to read: being Java it’s object oriented, but the curse of object orientation is that immediately after you’ve written the code, you realise you got the object model wrong and it needs to be refactored … then you realise the same thing after the first refactor and so on until you either go insane or give up. Even worse when many people write the code they all end up with slightly different views of what the object model should be. The result is what you see in Android today: model inconsistency and redundancy which get in the way when you try to understand the code flow simply by reading it. One stroke of luck was that there is actually only a single method all of the unlock types other than fingerprint go through KeyguardSecurityContainerController.showNextSecurityScreenOrFinish() with fingerprint unlocking going via a listener to the KeyguardUpdateMonitorCallback.onBiometricAuthenticated(). And, thanks to already disabling fingerprint only unlock, I know that if I simply stop triggering the latter event, it’s enough to disable fingerprint only unlock and all remaining authentication goes through the former callback. So to implement the required AND function, I just have to do this and check that a fingerprint authentication is also present in showNext.. (handily signalled by KeyguardUpdateMonitor.userUnlockedWithBiometric()). The latter being set fairly late in the sequence that does the onBiometricAuthenticated() callback (so I have to cut it off after this to prevent fingerprint only unlock). As part of the Android redundancy, there’s already a check for fingerprint unlock as its own segment of a big if/else statement in the showNext.. code; it’s probably a vestige from a different fingerprint unlock mechanism but I disabled it when the user enables 2FA just in case. There’s also an insanely complex set of listeners for updating the messages on the lockscreen to guide the user through unlocking, which I decided not to change (if you enable 2FA, you need to know how to use it). Finally, I diverted the code that would call the onBiometricAuthenticated() and instead routed it to onBiometricDetected() which triggers the LockScreen bouncer to pop up, so now you wake your phone, touch the fingerprint to the back, when authenticated, it pops up the bouncer and you enter your pin/pattern/password … neat (and simple)!

Well, not so fast. While the code above works perfectly if the lockscreen is listening for fingerprints, there are two cases where it doesn’t: if the phone is in lockdown or on first boot (because the Android way of not allowing fingerprint only authentication for those cases is not to listen for it). At this stage, my test phone is actually unusable because I can never supply the required fingerprint for 2FA unlocking. Fortunately a rooted adb can update the 2FA in the secure settings service: simply run sqlite3 on /data/system/locksettings.db and flip user_2fa from 1 to 0.

The fingerprint listener is started in KeyguardUpdateMonitor, but it has a fairly huge set of conditions in updateFingerprintListeningState() which is also overloaded by doing detection as well as authentication. In the end it’s not as difficult as it looks: shouldListenForFingerprint needs to be true and runDetect needs to be false. However, even then it doesn’t actually work (although debugging confirms it’s trying to start the fingerprint listening service); after a lot more debugging it turns out that the biometric server process, which runs fingerprint detection and authentication, also has a redundant check for whether the phone is encrypted or in lockdown and refuses to start if it is, which also now needs to return false for 2FA and bingo, it works in all circumstances.

Conclusion

The final diffstat for all of this is

5 files changed, 55 insertions(+), 3 deletions(-)

So I’d say that is way simpler than the GrapheneOS one. All that remains is to add a switch for the setting (under the fingerprint settings) in packages/apps/Settings and it’s done. If you’re brave enough to try this for yourself you can go to my github account and get both the frameworks and settings commits (if you don’t want fingerprint unlock disable when 2FA isn’t selected, you’ll have to remove the head commit in frameworks). I suppose I should also add I’ve up-ported all of my other security stuff and am now on Android-15 (LineageOS-22.2).

Securing a Rooted Android Phone

This article will discuss securing your phone after you’ve rooted it and installed your preferred os (it will not discuss how to root your phone or change the OS). Re-securing your phone requires the installation of a custom AVB key, which not all phones support, so I’ll only be discussing Google Pixel phones (which support this) and the LineageOS distribution (which is the one I use). The reason for wanting to do this is that by default LineageOS runs with the debug keys (i.e. known to everyone) with an unlocked bootloader, meaning OS updates and even binary changes to the system partition are easy to do. Since most android phones are fully locked, this isn’t a standard attack vector for malicious apps, but if someone is targetting you directly it may become one.

This article will cover how android verified boot (AVB) works, how to install your own custom AVB key and get a stock LineageOS distribution to use it and how to turn DM verity back on to make /system immutable.

A Brief Tour of Android Verified Boot (AVB)

We’ll actually be covering the 2.0 version of AVB, but that’s what most phones use today. The proprietary bootloader of a Pixel (the fastboot capable one you get to with adb reboot bootloader) uses a vbmeta partition to find the boot/recovery system and from there either enter recovery or boot the standard OS. vbmeta contains hashes for both this boot partition and the system partition. If your phone is unlocked the bootloader will simply boot the partitions vbmeta points to without any verification. If it is locked, the vbmeta partition must be signed by a key the phone knows. Pixel phones have two keyslots: a built in one which houses either the Google key or an OEM one and the custom slot, which is blank. In the unlocked mode, you can flash your own key into the custom slot using the fastboot flash avb_custom_key command.

The vbmeta partition also contains a boot flags region which tells the bootloader how to boot the OS. The two flags the OS knows about are in external/avb/libavb/avb_vbmeta_image.h:

/* Flags for the vbmeta image.
 *
 * AVB_VBMETA_IMAGE_FLAGS_HASHTREE_DISABLED: If this flag is set,
 * hashtree image verification will be disabled.
 *
 * AVB_VBMETA_IMAGE_FLAGS_VERIFICATION_DISABLED: If this flag is set,
 * verification will be disabled and descriptors will not be parsed.
 */
typedef enum {
  AVB_VBMETA_IMAGE_FLAGS_HASHTREE_DISABLED = (1 << 0),
  AVB_VBMETA_IMAGE_FLAGS_VERIFICATION_DISABLED = (1 << 1)
} AvbVBMetaImageFlags;

if the first flag is set then dm-verity is disabled and if the second one is set, the bootloader doesn’t pass the hash of the vbmeta partition on the kernel command line. In a standard LineageOS build, both these flags are set.

The reason for passing the vbmeta hash on the command line is so the android init process can load the vbmeta partition, hash it and verify against what the bootloader passed in, thus confirming there hasn’t been a time of check to time of use (TOCTOU) security breach. The init process cannot verify the signature for itself because the vbmeta signing public key isn’t built into the OS (which allows the OS to be signed after the images are build).

The description of the AVB_VBMETA_IMAGE_FLAGS_HASHTREE_DISABLED flag is slightly wrong. A standard android build always seems to calculate the dm-verity hash tree and insert it into the vbmeta partition (where it is verified by the vbmeta signature) it’s just that if this flag is set, the android init process won’t load the dm-verity hash tree and the system partition will thus be mutable.

Creating and Using your own custom Boot Key

Obviously android doesn’t use any standard tool form for its keys, so you have to create your own RSA 2048 (the literature implies it will work with 4096 as well but I haven’t tried it) AVB custom key using say openssl, then use avbtool (found in external/avb as a python script) to convert your RSA public key to a form that can be flashed in the phone:

avbtool extract_public_key --key pubkey.pem --output pkmd.bin

This can then be flashed to the unlocked phone (in the bootloader fastboot) with

fastboot flash avb_custom_key pkmd.bin

And you’re all set up to boot a custom signed OS.

Signing your LineageOS

There is a wrinkle to this: to re-sign the OS, you need the target-files.zip intermediate build, not the ROM install file. Unfortunately, this is pretty big (38GB for lineage-19.1) and doesn’t seem to be available for download any more. If you can find it, you can re-sign the stock LineageOS, but if not you have to build it yourself. Instructions for both building and re-signing can be found here. You need to follow this but in addition you must add two extra flags to the sign_target_files_apks command:

--avb_vbmeta_key=/path/to/private/avb.key
--avb_vbmeta_algorithm=SHA256_RSA2048

Which will ensure the vbmeta partition is signed with the key you created above.

Optionally Enabling dm-verity

If you want to enable dm-verity, you have to change the vbmeta flags to 0 (enable both hashtree and vbmeta verification) before you execute the signing command above. These flags are stored in the META/misc_info.txt file which you can extract from target-files.zip with

unzip target-files.zip META/misc_info.txt

And once done you can vi this file to find the line

avb_vbmeta_args=--flags 3 --padding_size 4096 --rollback_index 1804212800

If you update the 3 to 0 this will unset the two disable flags and allow you to do a dm-verity verified boot. Then use zip to replace this updated file

zip -u target-files.zip META/misc_info.txt

And then proceed with signing the updated target-files.zip

Wrinkle for Android-12 (lineage-19.1) and above

For all these versions, this patch ensures that if the vbmeta was signed then the vbmeta hash must be verified, otherwise the system will crash in early init, so you have no choice and must alter the avb_vbmeta_args above to either --flags 1 or --flags 0 so the vbmeta hash is passed in to init. Since you have to alter the flags anyway, you might as well enable dm-verity (set to 0) at the same time.

Re-Lock the Bootloader

Once you have installed both your custom keys and your custom signed boot image, you are ready to re-lock the bootloader. Beware that some phones will erase your data partition when you do this (the Google advice says they shouldn’t, but not all manufacturers bother to follow it), so make sure you have a backup (or are playing with a newly rooted phone).

fastboot flashing lock

Check with a reboot (the phone should now display a yellow warning triangle saying it is booting a custom OS rather than the orange unsigned OS one). If everything goes according to plan you can enter the developer settings and click the “OEM Unlocking” settings to disabled and your phone can no longer be unlocked without your say so.

Conclusions and Caveats

Following the above instructions, you can updated your phone so it will verify images you signed with your AVB key, turn on dm-verity if you wish and generally make your phone much more secure. However, remember that you haven’t erased the original AVB key, so the phone can still be updated to an image signed with that key and, worse, the recovery partition of LineageOS is modified to allow rollback, so it will allow the flashing of any signed image without triggering an erase of the data partition. There are also a few more problems like, thanks to a bug in AOSP, the recovery version of fastboot will actually allow commands that are usually forbidden if the phone is locked.

Debugging Android Early Boot Failures

Back in my blog post about Securing the Google SIP Stack, I did say I’d look at re-enabling SIP in Android-12, so with a view to doing that I tried building and booting LineageOS 19.1, but it crashed really early in the boot sequence (after the boot splash but before the boot animation started). It turns out that information on debugging the android early boot sequence is a bit scarce, so I thought I should write a post about how I did it just in case it helps someone else who’s struggling with a similar early boot problem.

How I usually Build and Boot Android

My builds are standard LineageOS with my patches to fix SIP and not much else. However, I do replace the debug keys with my signing keys and I also have an AVB key installed in the phone’s third party keyslot with which I sign the vbmeta for boot. This actually means that my phone is effectively locked but with a user supplied key (Yellow as google puts it).

My phone is now a pixel 3 (I had to say goodbye to the old Nexus One thanks to the US 3G turn off) and I do have a slightly broken Pixel 3 I play with for experimental patches, which is where I was trying to install Android-12.

Signing Seems to be the Problem

Just to verify my phone could actually boot a stock LineageOS (it could) I had to unlock it and this lead to the discovery that once unlocked, it would also boot my custom rom as well, so whatever was failing in early boot seemed to be connected with the device being locked.

I also discovered an interesting bug in the recovery rom fastboot: If you’re booting locked with your own keys, it will still let you perform all the usually forbidden fastboot commands (the one I was using was set_active). It turns out to be because of a bug in AOSP which treats yellow devices as unlocked in fastboot. Somewhat handy for debugging, but not so hot for security …

And so to Debugging Early Boot

The big problem with Android is there’s no way to get the console messages for early boot. Even if you enable adb early, it doesn’t get started until quite far in to the boot animation (which was way after the crash I was tripping over). However, android does have a pstore (previously ramoops) driver that can give you access to the previously crashed boot’s kernel messages (early init, fortunately, mostly logs to the kernel message log).

Forcing init to crash on failure

Ordinarily an init failure prints a message and reboots (to the bootloader), which doesn’t excite pstore into saving the kernel message log. fortunately there is a boot option (androidboot.init_fatal_panic) which can be set in the boot options (or kernel command line for a pixel-3 which can only boot the 4.9 kernel). If you build your own android, it’s fairly easy to add things to the android commandline (which is in boot.img) because all you need to do is extract BOOT/cmdline from the intermediate zip file you sign add any boot options you need and place it back in the zip file (before you sign it).

Unfortunately, this expedient didn’t work (no console logs appear in pstore). I did check that init was correctly panic’ing on failure by inducing an init failure in recovery mode and observing the panic (recovery mode allows you to run adb). But this induced panic also didn’t show up in pstore, meaning there’s actually some problem with pstore and early panics.

Security is the problem (as usual)

The actual problem turned out to be security (as usual): The pixel-3 does encrypted boot panic logs. The way this seems to work (at least in my reading of the google additional pstore patches) is that the bootloader itself encrypts the pstore ram area with a key on the /data partition, which means it only becomes visible after the device is unlocked. Unfortunately, if you trigger a panic before the device is unlocked (by echoing ‘c’ to /proc/sysrq-trigger) the panic message is lost, so pstore itself is useless for debugging early boot. There seems to be some communication of the keys by the vendor proprietary ramoops binary making it very difficult to figure out how it’s being done.

Why the early panic message is lost is a bit mysterious, but unfortunately pstore on the pixel-3 has several proprietary components around the encrypted message handling that make it hard to debug. I suspect if you don’t set up the pstore encryption keys, the bootloader erases the pstore ram area instead of encrypting it, but I can’t prove that.

Although it might be possible to fix the pstore drivers to preserve the ramoops from before device unlock, the participation of the proprietary bootloader in preserving the memory doesn’t make that look like a promising avenue to explore.

Anatomy of the Pixel-3 Boot Sequence

The Pixel-3 device boots through recovery. What this means is that the initial ramdisk (from boot.img) init is what boots both the recovery and normal boot paths. The only difference is that for recovery (and fastboot), the device stays in the ramdisk and for normal boot it mounts the /system partition and pivots to it. What makes this happen or not is the boot flag androidboot.force_normal_boot=1 which is added by the bootloader. Pretty much all the binary content and init rc files in the ramdisk are for recovery and its allied menus.

Since the boot paths are pretty radically different, because the normal boot first pivots to a first stage before going on to a second, but in the manner of containers, it might be possible to boot recovery first, start a dmesg logger and then re-exec init through the normal path

Forcing Re-Exec

The idea is to signal init to re-exec itself for the normal path. Of course, there have to be a few changes to do this: An item has to be added to the recovery menu to signal init and init itself has to be modified to do the re-exec on the signal (note you can’t just kick off an init with a new command line because init must be pid 1 for booting). Once this is done, there are problems with selinux (it won’t actually allow init to re-exec) and some mount moves. The selinux problem is fixable by switching it from enforcing to permissive (boot option androidboot.selinux=permissive) and the mount moves (which are forbidden if you’re running binaries from the mount points being moved) can instead become bind mounts. The whole patch becomes 31 insertions across 7 files in android_system_core.

The signal I chose was SIGUSR1, which isn’t usually used by anything in the bootloader and the addition of a menu item to recovery to send this signal to init was also another trivial patch. So finally we have a system from which I can start adb to trace the kernel log (adb shell dmesg -w) and then signal to init to re-exec. Surprisingly this worked and produced as the last message fragment:

[ 190.966881] init: [libfs_mgr]Created logical partition system_a on device /dev/block/dm-0
[ 190.967697] init: [libfs_mgr]Created logical partition vendor_a on device /dev/block/dm-1
[ 190.968367] init: [libfs_mgr]Created logical partition product_a on device /dev/block/dm-2
[ 190.969024] init: [libfs_mgr]Created logical partition system_ext_a on device /dev/block/dm-3
[ 190.969067] init: DSU not detected, proceeding with normal boot
[ 190.982957] init: [libfs_avb]Invalid hash size:
[ 190.982967] init: [libfs_avb]Failed to verify vbmeta digest
[ 190.982972] init: [libfs_avb]vbmeta digest error isn't allowed
[ 190.982980] init: Failed to open AvbHandle: No such file or directory
[ 190.982987] init: Failed to setup verity for '/system': No such file or directory
[ 190.982993] init: Failed to mount /system: No such file or directory
[ 190.983030] init: Failed to mount required partitions early …
[ 190.983483] init: InitFatalReboot: signal 6
[ 190.984849] init: #00 pc 0000000000123b38 /system/bin/init
[ 190.984857] init: #01 pc 00000000000bc9a8 /system/bin/init
[ 190.984864] init: #02 pc 000000000001595c /system/lib64/libbase.so
[ 190.984869] init: #03 pc 0000000000014f8c /system/lib64/libbase.so
[ 190.984874] init: #04 pc 00000000000e6984 /system/bin/init
[ 190.984878] init: #05 pc 00000000000aa144 /system/bin/init
[ 190.984883] init: #06 pc 00000000000487dc /system/lib64/libc.so
[ 190.984889] init: Reboot ending, jumping to kernel

Which indicates exactly where the problem is.

Fixing the problem

Once the messages are identified, the problem turns out to be in system/core ec10d3cf6 “libfs_avb: verifying vbmeta digest early”, which is inherited from AOSP and which even says in in it’s commit message “the device will not boot if: 1. The image is signed with FLAGS_VERIFICATION_DISABLED is set 2. The device state is locked” which is basically my boot state, so thanks for that one google. Reverting this commit can be done cleanly and now the signed image boots without a problem.

I note that I could also simply add hashtree verification to my boot, but LineageOS is based on the eng target, which has FLAGS_VERIFICATION_DISABLED built into the main build Makefile. It might be possible to change it, but not easily I’m guessing … although I might try fixing it this way at some point, since it would make my phones much more secure.

Conclusion

Debugging android early boot is still a terribly hard problem. Probably someone with more patience for disassembling proprietary binaries could take apart pixel-3 vendor ramoops and figure out if it’s possible to get a pstore oops log out of early boot (which would be the easiest way to debug problems). But failing that the simple hack to re-exec init worked enough to show me where the problem was (of course, if init had continued longer it would likely have run into other issues caused by the way I hacked it).

Securing the Google SIP Stack

A while ago I mentioned I use Android-10 with the built in SIP stack and that the Google stack was pretty buggy and I had to fix it simply to get it to function without disconnecting all the time. Since then I’ve upported my fixes to Android-11 (the jejb-11 branch in the repositories) by using LineageOS-19.1. However, another major deficiency in the Google SIP stack is its complete lack of security: both the SIP signalling and the media streams are all unencrypted meaning they can be intercepted and tapped by pretty much anyone in the network path running tcpdump. Why this is so, particularly for a company that keeps touting its security credentials is anyone’s guess. I personally suspect they added SIP in Android-4 with a view to basing Google Voice on it, decided later that proprietary VoIP protocols was the way to go but got stuck with people actually using the SIP stack for other calling services so they couldn’t rip it out and instead simply neglected it hoping it would die quietly due to lack of features and updates.

This blog post is a guide to how I took the fully unsecured Google SIP stack and added security to it. It also gives a brief overview of some of the security protocols you need to understand to get secure VoIP working.

What is SIP

What I’m calling SIP (but really a VoIP system using SIP) is a protocol consisting of several pieces. SIP (Session Initiation Protocol), RFC 3261, is really only one piece: it is the “signalling” layer meaning that call initiation, response and parameters are all communicated this way. However, simple SIP isn’t enough for a complete VoIP stack; once a call moves to in progress, there must be an agreement on where the media streams are and how they’re encoded. This piece is called a SDP (Session Description Protocol) agreement and is usually negotiated in the body of the SIP INVITE and response messages and finally once agreement is reached, the actual media stream for call audio goes over a different protocol called RTP (Real-time Transport Protocol).

How Google did SIP

The trick to adding protocols fast is to take them from someone else (if you’re open source, this is encouraged) so google actually chose the NIST-SIP java stack (which later became the JAIN-SIP stack) as the basis for SIP in android. However, that only covered signalling and they had to plumb it in to the android Phone model. One essential glue piece is frameworks/opt/net/voip which supplies the SDP negotiating layer and interfaces the network codec to the phone audio. This isn’t quite enough because the telephony service and the Dialer also need to be involved to do the account setup and call routing. It always interested me that SIP was essentially special cased inside these services and apps instead of being a plug in, but that’s due to the fact that some of the classes that need extending to add phone protocols are internal only; presumably so only manufacturers can add phone features.

Securing SIP

This is pretty easy following the time honoured path of sending messages over TLS instead of in the clear simply by using a TLS wrappering technique of secure sockets and, indeed, this is how RFC 3261 says to do it. However, even this minor re-engineering proved unnecessary because the nist-sip stack was already TLS capable, it simply wasn’t allowed to be activated that way by the configuration options Google presented. A simple 10 line patch in a couple of repositories (external/nist_sip, packages/services/Telephony and frameworks/opt/net/voip) fixed this and the SIP stack messaging was secured leaving only the voice stream insecure.

SDP

As I said above, the google frameworks/opt/net/voip does all the SDP negotiation. This isn’t actually part of SIP. The SDP negotiation is conducted over SIP messages (which means it’s secured thanks to the above) but how this should be done isn’t part of the SIP RFC. Instead SDP has its own RFC 4566 which is what the class follows (mainly for codec and port negotiation). You’d think that if it’s already secured by SIP, there’s no additional problem, but, unfortunately, using SRTP as the audio stream requires the exchange of additional security parameters which added to SDP by RFC 4568. To incorporate this into the Google SIP stack, it has to be integrated into the voip class. The essential additions in this RFC are a separate media description protocol (RTP/SAVP) for the secure stream and the addition of a set of tagged a=crypto: lines for key negotiation.

As will be a common theme: not all of RFC 4568 has to be implemented to get a secure RTP stream. It goes into great detail about key lifetime and master key indexes, neither of which are used by the asterisk SIP stack (which is the one my phone communicates with) so they’re not implemented. Briefly, it is best practice in TLS to rekey the transport periodically, so part of key negotiation should be key lifetime (actually, this isn’t as important to SRTP as it is to TLS, see below, which is why asterisk ignores it) and the people writing the spec thought it would be great to have a set of keys to choose from instead of just a single one (The Master Key Identifier) but realistically that simply adds a load of complexity for not much security benefit and, again, is ignored by asterisk which uses a single key.

In the end, it was a case of adding a new class for parsing the a=crypto: lines of SDP and doing a loop in the audio protocol for RTP/SAVP if TLS were set as the transport. This ended up being a ~400 line patch.

Secure RTP

RTP itself is governed by RFC 3550 which actually contains two separate stream descriptions: the actual media over RTP and a control protocol over RTCP. RTCP is mostly used for multi-party and video calls (where you want reports on reception quality to up/downshift the video resolution) and really serves no purpose for audio, so it isn’t implemented in the Google SIP stack (and isn’t really used by asterisk for audio only either).

When it comes to securing RTP (and RTCP) you’d think the time honoured mechanism (using secure sockets) would have applied although, since RTP is transmitted over UDP, one would have to use DTLS instead of TLS. Apparently the IETF did consider this, but elected to define a new protocol instead (or actually two: SRTP and SRTCP) in RFC 3711. One of the problems with this new protocol is that it also defines a new ciphersuite (AES_CM_…) which isn’t found in any of the standard SSL implementations. Although the AES_CM ciphers are very similar in operation to the AES_GCM ciphers of TLS (Indeed AES_GCM was adopted for SRTP in a later RFC 7714) they were never incorporated into the TLS ciphersuite definition.

So now there are two problems: adding code for the new protocol and performing the new encyrption/decryption scheme. Fortunately, there already exists a library (libsrtp) that can do this and even more fortunately it’s shipped in android (external/libsrtp2) although it looks to be one of those throwaway additions where the library hasn’t really been updated since it was added (for cuttlefish gcastv2) in 2019 and so is still at a pre 2.3.0 version (I did check and there doesn’t look to be any essential bug fixes missing vs upstream, so it seems usable as is).

One of the really great things about libsrtp is that it has srtp_protect and srtp_unprotect functions which transform SRTP to RTP and vice versa, so it’s easily possible to layer this library directly into an existing RTP implementation. When doing this you have to remember that the encryption also includes authentication, so the size of the packet expands which is why the initial allocation size of the buffers has to be increased. One of the not so great things is that it implements all its own crypto primitives including AES and SHA1 (which most cryptographers think is always a bad idea) but on the plus side, it’s the same library asterisk uses so how much of a real problem could this be …

Following the simple layering approach, I constructed a patch to do the RTP<->SRTP transform in the JNI code if a key is passed in, so now everything just works and setting asterisk to SRTP only confirms the phone is able to send and receive encrypted audio streams. This ends up being a ~140 line patch.

So where does DTLS come in?

Anyone spending any time at all looking at protocols which use RTP, like webRTC, sees RTP and DTLS always mentioned in the same breath. Even asterisk has support for DTLS, so why is this? The answer is that if you use RTP outside the SIP framework, you still need a way of agreeing on the keys using SDP. That key agreement must be protected (and can’t go over RTCP because that would cause a chicken and egg problem) so implementations like webRTC use DTLS to exchange the initial SDP offer and answer negotiation. This is actually referred to as DTLS-SRTP even though it’s an initial DTLS handshake followed by SRTP (with no further DTLS in sight). However, this DTLS handshake is completely unnecessary for SIP, since the SDP handshake can be done over TLS protected SIP messaging instead (although I’ve yet to find anyone who can convincingly explain why this initial handshake has to go over DTLS and not TLS like SIP … I suspect it has something to do with wanting the protocol to be all UDP and go over the same initial port).

Conclusion

This whole exercise ended up producing less than 1000 lines in patches and taking a couple of days over Christmas to complete. That’s actually much simpler and way less time than I expected (given the complexity in the RFCs involved), which is why I didn’t look at doing this until now. I suppose the next thing I need to look at is reinserting the SIP stack into Android-12, but I’ll save that for when Android-11 falls out of support.