Monthly Archives: February 2015

Anatomy of the UEFI Boot Sequence on the Intel Galileo

The Basics

UEFI boot officially has three phases (SEC, PEI and DXE).  However, the DXE phase is divided into DXEBoot and DXERuntime (the former is eliminated after the call to ExitBootServices()).  The jobs of each phase are:

  1. SEC (SECurity phase).  This contains all the CPU initialisation code from the cold boot entry point on.  Its job is to set the system up far enough to find, validate, install and run the PEI.
  2. PEI (Pre-EFI Initialization phase).  This configures the entire platform and then loads and boots the DXE.
  3. DXE (Driver eXecution Environment).  This is where the UEFI system loads drivers for configured devices, if necessary, mounts drives, and finds and executes the boot code.  After control is transferred to the booted OS, the DXERuntime stays resident to handle any OS to UEFI calls.

How it works on Quark

This all sounds very simple (and very like the way an OS like Linux boots up).  However, there’s a crucial difference: the platform really is completely unconfigured when SEC begins.  In particular it won’t have any main memory, so you begin in a read only environment until you can configure some memory.  C code can’t begin executing until you’ve at least found enough writable memory for a stack, so the SEC begins in hand crafted assembly until it can set one up.

On all x86 processors (including the Quark), power on begins execution in 16 bit mode at the ResetVector (0xfffffff0).  As a helping hand, the default power on bus routing maps the top 128KB of the address space onto the top of the SPI flash (read only, of course) via a PCI routing in the Legacy Bridge, meaning that the reset vector executes directly from the SPI flash.  This is actually very slow: SPI means Serial Peripheral Interface, so every byte of SPI flash has to be read serially into the instruction cache before it can be executed.
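The address arithmetic here is easy to check by hand, but a few lines of Python make it concrete.  This is only a sketch of the numbers quoted above (a 32 bit physical address space, a reset vector 16 bytes below the top, and a 128KB window onto the flash); the names are mine, not anything from the Quark sources.

```python
# Address arithmetic for the x86 reset vector and the power-on flash window.
ADDRESS_SPACE = 1 << 32             # 32 bit physical address space (4GB)
RESET_VECTOR = ADDRESS_SPACE - 16   # 16 bytes below the top of the address space
WINDOW_SIZE = 128 * 1024            # top 128KB is routed to SPI flash at power on

window_base = ADDRESS_SPACE - WINDOW_SIZE
window_top = ADDRESS_SPACE - 1


def in_boot_window(addr):
    """Return True if addr is served from SPI flash under the default routing."""
    return window_base <= addr <= window_top


print(hex(RESET_VECTOR))             # 0xfffffff0
print(hex(window_base))              # 0xfffe0000
print(in_boot_window(RESET_VECTOR))  # True
```

So everything the processor can fetch before any reconfiguration has to live between 0xfffe0000 and 0xffffffff.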

The hand crafted assembly clears the cache, transitions to flat 32 bit execution mode and sets up the necessary x86 descriptor tables.  It turns out that memory configuration on the Quark SoC is fairly involved, so, to spare the programmer from having to do it all in assembly, there’s a small (512kB) embedded static RAM (eSRAM) that can be configured easily.  The last assembly job of the SEC is therefore to configure the eSRAM (at a fixed address of 2GB), set the top as the stack, load the PEI into the base (by reconfiguring the SPI flash mapping to map the entire 8MB flash to the top of memory and then copying the firmware volume containing the PEI) and begin executing.
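To make the resulting memory map concrete, here is a rough sketch of where things end up.  The sizes and the 2GB base come from the description above; I’m assuming that "top as the stack" means the initial stack pointer sits at the very top of the eSRAM (x86 stacks grow downwards), and the helper name is purely illustrative.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

ESRAM_BASE = 2 * GB       # fixed address the SEC code programs for the eSRAM
ESRAM_SIZE = 512 * KB
FLASH_SIZE = 8 * MB
TOP_OF_MEMORY = 4 * GB    # top of the 32 bit physical address space

# Assumption: the initial stack pointer is the top of eSRAM; the PEI firmware
# volume is copied to the base, so stack and image grow towards each other.
esram_top = ESRAM_BASE + ESRAM_SIZE
flash_base = TOP_OF_MEMORY - FLASH_SIZE


def flash_offset_to_address(offset):
    """Map an offset inside the 8MB flash to its address once fully mapped."""
    if not 0 <= offset < FLASH_SIZE:
        raise ValueError("offset is outside the flash")
    return flash_base + offset


print(hex(ESRAM_BASE), hex(esram_top))               # 0x80000000 0x80080000
print(hex(flash_base))                               # 0xff800000
print(hex(flash_offset_to_address(FLASH_SIZE - 16))) # 0xfffffff0
```

The last line is a handy sanity check: the final 16 bytes of the flash land exactly on the reset vector.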

QuarkPlatform Build Oddities

Usually the PEI code is located by the standard Firmware Volume code of UEFI and the build time PCDs (Platform Configuration Database entries), which use the values in the Flash Description File to build the firmware.  However, the current Quark Platform package has a different style because it rips apart and rebuilds the flash volumes, so instead of using PCDs, it uses something it calls Master Flash Headers (MFHs), which are home grown for Quark.  These are a fixed area of the flash that can be read as a database giving the new volume layout (essentially duplicating what the PCDs would normally have done).  Additionally, the Quark adds a non-standard signature header occupying 1kB to each flash volume, which serves two purposes: for the SECURE_LD case, it actually validates the volume, but for the three items in the firmware that don’t have flash headers (the kernel, the initrd and the grub config) it serves to give the lengths of each.

Laying out Flash ROM

This is a really big deal for most embedded systems because the amount of flash available is really limited.  The Galileo board is nice because it supplies 8MB of flash … which is huge in embedded terms.  All flash is divided into Flash Volumes[1].  If you look at OVMF, for instance, it builds its flash as four volumes: three for the SEC, PEI and DXE phases and one for the EFI variables.  In EdkII, flash images are built from the Flash Description File (the one with a .fdf ending).  Usually some part of the flash is compressed and has to be inflated into memory (in OVMF this is PEI and DXE) and some parts are designed to be executed in place (usually SEC).  If you look at the Galileo layout, you see that it has a big SEC phase section (called BOOTROM_OVERRIDE) designed for the top 128kB of the flash, the usual variable area and then five additional sections: two for PEI and DXE and three recovery ones (and, of course, an additional payload section for the OS that boots from flash).
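When you start moving sections around, it helps to have a quick sanity check that nothing overlaps or falls off the end of the flash.  The sketch below is one way to do that; only the 8MB flash size and the BOOTROM_OVERRIDE section in the top 128kB come from the Galileo layout, while every EXAMPLE_ name, offset and size is made up purely for illustration.

```python
KB, MB = 1 << 10, 1 << 20
FLASH_SIZE = 8 * MB


def check_layout(regions, flash_size):
    """Return a list of problems for (name, offset, size) flash regions."""
    problems = []
    ordered = sorted(regions, key=lambda r: r[1])
    for name, offset, size in ordered:
        if offset + size > flash_size:
            problems.append(f"{name} runs past the end of the flash")
    # After sorting by offset, any overlap shows up between neighbours.
    for (a, a_off, a_size), (b, b_off, _) in zip(ordered, ordered[1:]):
        if a_off + a_size > b_off:
            problems.append(f"{a} overlaps {b}")
    return problems


# Hypothetical layout: only BOOTROM_OVERRIDE's position reflects the text.
good = [
    ("EXAMPLE_PAYLOAD", 0, 4 * MB),
    ("EXAMPLE_PEI", 4 * MB, 1 * MB),
    ("BOOTROM_OVERRIDE", FLASH_SIZE - 128 * KB, 128 * KB),
]
bad = good + [("EXAMPLE_TOO_BIG", 7 * MB, 1 * MB)]

print(check_layout(good, FLASH_SIZE))  # []
print(check_layout(bad, FLASH_SIZE))   # ['EXAMPLE_TOO_BIG overlaps BOOTROM_OVERRIDE']
```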

Embedded Recovery Sections

For embedded devices (and even normal computers) recovery in the face of flash failure (whether from component issues or a botched update of the flash) is really important, so the Galileo follows a two stage fallback process.  The first stage is to detect, in the SEC, a critical error signalled by the platform sticky bit or the recovery strap, and boot up to the fixed phase recovery, which tries to locate a recovery capsule on the USB media[2].  The other recovery is a simple copy of the PEI image for fallback in case the primary PEI image fails (by now you’ll have realised there are three separate but pretty much identical copies of PEI in the flash ROM).  One of the first fixes that can be made to the Quark build is to consolidate all of these into a single build description.

Putting it all together: Implementing a compressed PEI Phase

One of the first things I discovered when trying to update the UEFI version to something more modern is that the size of the PEI phase overflows the allowed size of the firmware volume.  This means either redoing the flash layout or compressing the PEI image.  I chose the latter and this is the story of how it went.
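The trade-off can be sketched in a few lines of Python.  This is not how the EdkII build does it: Python’s lzma module simply stands in for whatever compressor the build uses, and the image contents and volume size below are invented for the example.  The point is only that an image too big for its volume may fit once compressed, at the cost of having to inflate it into memory before it can run.

```python
import lzma

VOLUME_SIZE = 16 * 1024   # hypothetical firmware volume size, for illustration


def fits(image, volume_size):
    """Return True if the image can be stored in the volume as is."""
    return len(image) <= volume_size


def compressed_size(image):
    """Size of the image after LZMA compression (stand-in compressor)."""
    return len(lzma.compress(image))


# Synthetic, highly repetitive stand-in for a PEI image (26600 bytes).
image = b"\x90" * 1000 + bytes(range(256)) * 100

print(fits(image, VOLUME_SIZE))                # False: raw image overflows
print(compressed_size(image) <= VOLUME_SIZE)   # True: compressed copy fits

# A compressed image is no longer execute in place: it must round-trip
# through decompression into RAM before any of it can run.
restored = lzma.decompress(lzma.compress(image))
print(restored == image)                       # True
```

Real firmware compresses far less well than this synthetic buffer, so treat the numbers as nothing more than a demonstration of the check.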

The first problem is that debug prints don’t work in the SEC phase, which is where the changes are going to have to be.  This is a big stumbling block because without debugging, you never know where anything went wrong.  It turns out that UEFI nicely supports this via a special DebugLib that outputs to the serial console, but that the Galileo firmware build has this disabled by this line:

[LibraryClasses.IA32.SEC]
...
 DebugLib|MdePkg/Library/BaseDebugLibNull/BaseDebugLibNull.inf

The BaseDebugLibNull does pretty much what you expect: it throws away all Debug messages.  When this is changed to something that outputs messages, the size of the PEI image explodes again, mainly because Stage1 has all the SEC phase code in it.  The fix here is to enable debugging only inside the QuarkResetVector SEC phase code.  You can do this in the .dsc file with

 QuarkPlatformPkg/Cpu/Sec/ResetVector/QuarkResetVector.inf {
   <LibraryClasses>
     DebugLib|MdePkg/Library/BaseDebugLibSerialPort/BaseDebugLibSerialPort.inf
 }

And now debugging works in the SEC phase!

It turns out that a compressed PEI is possible but somewhat more involved than I imagined, so that will be the subject of the next blog post.  For now I’ll round out with other oddities I discovered along the way.

Quark Platform SEC and PEI Oddities

On the current Quark build, the SEC phase is designed to be installed into the bootrom from 0xfffe0000 to 0xffffffff.  This contains a special copy of the reset vector (in theory it contains the PEI key validation for SECURE_LD, but in practice the verifiers are hard coded to return success).  The next oddity is that the stage1 image, which should really just be the PEI core, actually contains another boot from scratch SEC phase, except this one is based on the standard IA32 reset vector code plus a magic QuarkSecLib and then the PEI code.  This causes the stage1 bring up to be different as well: usually, the SEC code locates the PEI core in stage1 and loads, relocates and executes it starting from the entry point PeiCore().  However, Quark doesn’t do this at all.  It relies on the Firmware Volume generator populating the first ZeroVector (an area occupying the first 16 bytes of the Firmware Volume Header) with a magic entry (located in the ResetVector via the magic string ‘SPI Entry Point ’ with the trailing space).  The SEC code indirects through the ZeroVector to this code, which effectively re-initialises the stack and begins executing the new SEC code, which then locates the internal copy of the PEI core and jumps to it.
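If you want to poke at a firmware volume image to see this for yourself, something like the following sketch is a starting point.  It relies only on the standard Firmware Volume Header layout (a 16 byte ZeroVector, a 16 byte filesystem GUID, an 8 byte length and then the ‘_FVH’ signature) and on the magic string quoted above.  How Quark actually encodes the entry inside the ZeroVector is not something I’m showing here, and the test volume below is entirely synthetic.

```python
import struct

MAGIC = b"SPI Entry Point "   # 16 bytes, including the trailing space
FV_SIGNATURE = b"_FVH"
ZERO_VECTOR_SIZE = 16
SIGNATURE_OFFSET = 40         # ZeroVector (16) + filesystem GUID (16) + length (8)


def zero_vector(fv):
    """Return the first 16 bytes of a firmware volume after a signature check."""
    if fv[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 4] != FV_SIGNATURE:
        raise ValueError("not a firmware volume")
    return bytes(fv[:ZERO_VECTOR_SIZE])


def find_magic(image):
    """Return the offset of the magic marker in the image, or -1 if absent."""
    return image.find(MAGIC)


# Synthetic volume: zeroed ZeroVector, valid length and signature, and the
# marker dropped at an arbitrary offset to stand in for the reset vector code.
fv = bytearray(0x1000)
fv[32:40] = struct.pack("<Q", len(fv))
fv[40:44] = FV_SIGNATURE
fv[0x800:0x810] = MAGIC

print(zero_vector(fv).hex())   # 32 zeros: nothing has populated it
print(hex(find_magic(fv)))     # 0x800
```

On a real Quark stage1 volume you would expect the ZeroVector not to be all zeros, since that is where the generator puts the magic entry.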