Security

To prevent running arbitrary payloads inside a pVM, the Android Virtualization Framework (AVF) uses a layered security approach in which each layer adds further enforcement. The AVF security layers are:

  • Android – Android ensures that only apps with pVM permissions are allowed to create or inspect pVMs.

  • Bootloader – The bootloader ensures that only pVM images signed by Google or device vendors are allowed to boot, and it respects the Android Verified Boot procedure. As a result, apps running pVMs can't bundle their own kernels.

  • pVM – The pVM provides defense-in-depth, such as with SELinux, for payloads run in the pVM. Defense-in-depth disallows mapping data as executable (neverallow execmem) and ensures that W^X holds for all file types.

Security model

Confidentiality, integrity, and availability (the CIA triad) make up a model designed to guide information security policies:

  • Confidentiality is a set of rules that limits access to information.
  • Integrity is the assurance that the information is trustworthy and accurate.
  • Availability is a guarantee of reliable access to the information by authorized entities.

Confidentiality and integrity

Confidentiality stems from the memory isolation properties enforced by the pKVM hypervisor. pKVM tracks the ownership of individual physical memory pages and any requests from owners for pages to be shared. pKVM ensures that only entitled pVMs (host and guests) have a given page mapped in their stage 2 page tables, which are controlled by the hypervisor. This architecture ensures that the contents of memory owned by a pVM remain private unless the owner explicitly shares it with another pVM.
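The ownership and sharing rules can be pictured with a toy model. The following is a minimal sketch, not pKVM code: the PageTracker class and its method names are hypothetical, and real stage 2 page tables are replaced by plain Python dictionaries and sets.

```python
class PageTracker:
    """Toy model of hypervisor-side page ownership (hypothetical, not pKVM code)."""

    def __init__(self):
        self.owner = {}   # page number -> name of the owning VM
        self.shared = {}  # page number -> set of VMs the owner has shared with

    def donate(self, page, vm):
        """Give a currently unowned page to `vm`."""
        if page in self.owner:
            raise PermissionError(f"page {page} is already owned")
        self.owner[page] = vm
        self.shared[page] = set()

    def share(self, requester, page, target):
        """Only the owner of a page may share it with another VM."""
        if self.owner.get(page) != requester:
            raise PermissionError(f"{requester} does not own page {page}")
        self.shared[page].add(target)

    def can_access(self, vm, page):
        """A VM may map a page only if it owns it or the owner shared it."""
        return self.owner.get(page) == vm or vm in self.shared.get(page, set())
```

In this model the host is just another entry in the table, which mirrors the point that the host has no implicit access to guest pages.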

Restrictions for maintaining confidentiality also extend to any entities in the system that perform memory accesses on behalf of pVMs, namely DMA-capable devices and services running in more privileged layers. System-on-Chip (SoC) vendors must satisfy a new set of requirements before they can support pKVM. If these requirements aren't met, confidentiality can't be provided.

Integrity applies to data in memory and computation. pVMs can't:

  • Modify each other's memory without consent.
  • Influence each other's CPU state.

These requirements are enforced by the hypervisor. However, data integrity problems also arise with virtual data storage, where other solutions, such as dm-verity or AuthFS, must be applied.

These principles are no different from the process isolation offered by Linux, where access to memory pages is controlled with stage 1 page tables and the kernel context-switches between processes. However, the EL2 portion of pKVM, which enforces these properties, has roughly three orders of magnitude less attack surface than the entire Linux kernel (roughly 10 thousand versus 20 million lines of code) and therefore offers stronger assurance to use cases that are too sensitive to rely on process isolation.

Given its size, pKVM lends itself to formal verification. We're actively supporting academic research, which aims to formally prove these properties on the actual pKVM binary.

The remainder of this page addresses the confidentiality and integrity guarantees that each component around pKVM provides.

Hypervisor

pKVM is a KVM-based hypervisor that isolates pVMs and Android into mutually distrusting execution environments. The following properties hold in the event of a compromise within any pVM, including the host. Alternative hypervisors that comply with AVF need to provide similar properties.

  • A pVM can't access a page belonging to another entity, such as a pVM or hypervisor, unless explicitly shared by the page owner. This rule includes the host pVM and applies to both CPU and DMA accesses.

  • Before a page used by a pVM is returned to the host, such as when the pVM is destroyed, it's wiped.

  • The memory of all pVMs and the pVM firmware from one device boot is wiped before the OS bootloader runs in the subsequent device boot.

  • When a hardware debugger, such as SJTAG, is attached, a pVM can't access its previously minted keys.

  • The pVM firmware doesn't boot if it can't verify the initial image.

  • The pVM firmware doesn't boot if the integrity of the instance.img is compromised.

  • DICE certificate chain and Compound Device Identifiers (CDIs) provided to a pVM instance can be derived only by that particular instance.
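The wipe-before-return property in the list above can be sketched as follows. This is an illustration only: reclaim_pages, the list-based host pool, and the 4096-byte page size are assumptions made for the example, not part of pKVM.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def reclaim_pages(guest_pages, host_pool):
    """Zero every guest page in place, then hand it back to the host pool."""
    for page in guest_pages:
        page[:] = bytes(len(page))  # overwrite contents so no guest data survives
        host_pool.append(page)
    guest_pages.clear()
```

The important detail is the ordering: the page is zeroed before it becomes reachable from the host pool, so the host never observes stale guest contents.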

Guest OS

Microdroid is an example of an OS running within a pVM. Microdroid consists of a U-boot-based bootloader, GKI, a subset of Android userspace, and a payload launcher. The following properties hold in the event of a compromise within any pVM, including the host. Alternative OSs running in a pVM should provide similar properties.

  • Microdroid won't boot if boot.img, super.img, vbmeta.img, or vbmeta_system.img can't be verified.

  • Microdroid won't boot if the APK verification fails.

  • An existing Microdroid instance won't boot after its APK has been updated.

  • Microdroid won't boot if any of the APEXes fail the verification.

  • Microdroid won't boot (or boots with a clean initial state) if the instance.img is modified outside of the guest pVM.

  • Microdroid provides attestation to the boot chain.

  • Any (unsigned) modification to the disk images shared with the guest pVM causes an I/O error on the pVM side.

  • DICE certificate chain and CDIs provided to a pVM instance can be derived only by that particular instance.

  • Writes to an encrypted storage volume are confidential. However, there is no rollback protection at the granularity of an encryption block. Furthermore, other arbitrary external tampering with a data block causes that block to appear as garbage to Microdroid, rather than being detected explicitly as an I/O error.
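The I/O error behavior for verified disk images in the list above can be illustrated with per-block hashing. The sketch below is written in the spirit of dm-verity but is heavily simplified: it uses a flat list of hashes instead of a signed hash tree, and the function names and 4096-byte block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def build_hashes(image: bytes) -> list[bytes]:
    """Compute one SHA-256 digest per block of a trusted image."""
    return [hashlib.sha256(image[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(image), BLOCK_SIZE)]


def read_block(image: bytes, index: int, trusted_hashes: list[bytes]) -> bytes:
    """Return a block only if it still matches the trusted digest."""
    block = image[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
    if hashlib.sha256(block).digest() != trusted_hashes[index]:
        raise OSError(f"I/O error: block {index} failed verification")
    return block
```

Unmodified blocks remain readable; only reads that touch a modified block fail. This contrasts with the encrypted storage case, where tampering yields garbage data instead of an explicit error.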

Android

These are properties maintained by Android as the host but don't hold true in the event of a host compromise:

  • A guest pVM can't directly interact with other guest pVMs, for example by making a vsock connection.

  • Only the VirtualizationService in the host pVM can create a communication channel to a pVM.

  • Only the apps that are signed with the platform key can request permission to create, own, or interact with pVMs.

  • The identifier, called a context identifier (CID), used in setting up vsock connections between the host and a pVM isn't reused while the host pVM is running. For example, you can't replace a running pVM with another.
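One way to obtain the no-reuse property for CIDs is a monotonically increasing counter. The sketch below is hypothetical and doesn't describe how VirtualizationService is implemented; the CidAllocator class is invented for illustration, and the starting value of 3 assumes the usual vsock convention that CIDs 0 to 2 are reserved.

```python
import itertools


class CidAllocator:
    """Hands out CIDs that are never reused while this object lives (hypothetical)."""

    FIRST_GUEST_CID = 3  # assumption: vsock reserves CIDs 0-2 (2 is the host)

    def __init__(self):
        self._next = itertools.count(self.FIRST_GUEST_CID)
        self.running = {}  # cid -> VM name

    def start_vm(self, name):
        """Assign a fresh CID to a newly started VM."""
        cid = next(self._next)
        self.running[cid] = name
        return cid

    def stop_vm(self, cid):
        """Retire the CID; it is not returned to any free list."""
        self.running.pop(cid)
```

Because a stopped VM's CID is never handed out again, a peer holding an old CID can't be silently connected to a different VM.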

Availability

In the context of pVMs, availability refers to the host allocating sufficient resources to guests so guests can perform the tasks they are designed to perform.

The host's responsibilities include scheduling the pVM's virtual CPUs. KVM, unlike conventional Type-1 hypervisors (such as Xen), makes the explicit design decision to delegate workload scheduling to the host kernel. Given the size and complexity of today's schedulers, this design decision significantly reduces the size of the trusted computing base (TCB) and enables the host to make more informed scheduling decisions to optimize performance. However, a malicious host can choose to never schedule a guest.

Similarly, pKVM delegates physical interrupt handling to the host kernel to reduce the complexity of the hypervisor and leave the host in charge of scheduling. Effort is taken to ensure that incorrect forwarding of guest interrupts (too few, too many, or misrouted interrupts) can result only in a denial of service.

Finally, the host's virtual machine monitor (VMM) process is responsible for allocating memory and providing virtual devices, such as a network card. A malicious VMM can withhold resources from the guest.

Although pKVM doesn't provide availability to guests, the design protects the host's availability from malicious guests because the host can always preempt or terminate a guest and reclaim its resources.

Secure boot

Data is tied to instances of a pVM, and secure boot ensures that access to an instance's data can be controlled. The first boot of an instance provisions it by randomly generating a secret salt for the pVM and extracting details, such as verification public keys and hashes, from the loaded images. This information is used to verify subsequent boots of the pVM instance and ensure the instance's secrets are released only to images that pass verification. This process occurs for every loading stage within the pVM: pVM firmware, pVM ABL, Microdroid, and so on.
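The provision-then-verify flow can be sketched as follows. This is a simplified illustration under stated assumptions: InstanceRecord is a hypothetical stand-in for the persisted instance data, only a single image hash is pinned (the text above also mentions verification public keys), and the salt is returned directly instead of feeding a key derivation.

```python
import hashlib
import hmac
import os


class InstanceRecord:
    """Per-instance state created on first boot (hypothetical stand-in)."""

    def __init__(self):
        self.salt = None
        self.image_hash = None


def boot(record: InstanceRecord, image: bytes) -> bytes:
    """Release the instance salt only to the image recorded at first boot."""
    measured = hashlib.sha256(image).digest()
    if record.salt is None:
        # First boot: provision a random salt and pin the image measurement.
        record.salt = os.urandom(32)
        record.image_hash = measured
    elif not hmac.compare_digest(measured, record.image_hash):
        raise RuntimeError("image doesn't match the provisioned instance")
    return record.salt
```

A later boot with the same image gets the same salt back, while a different image is refused and never sees the instance's secret.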

DICE provides each loading stage with an attestation key pair, the public part of which is certified in the DICE certificate for that stage. This key pair can change between boots, so a sealing secret is also derived that is stable for the VM instance across reboots and, as such, is suitable for protecting persistent state. The sealing secret is highly valuable to the VM, so it shouldn't be used directly. Instead, sealing keys should be derived from the sealing secret, and the sealing secret should be destroyed as early as possible.
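The advice to derive sealing keys and then discard the sealing secret can be sketched with HKDF (RFC 5869) built on the Python standard library. The choice of HKDF-SHA256, the derive_sealing_keys helper, and the label strings are assumptions for illustration; the source doesn't specify which key derivation function is used.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF extract-and-expand with SHA-256, following RFC 5869."""
    prk = hmac.new(salt or bytes(32), ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_sealing_keys(sealing_secret: bytearray, labels):
    """Derive one key per purpose label, then overwrite the secret buffer."""
    keys = {label: hkdf_sha256(bytes(sealing_secret), b"", label.encode())
            for label in labels}
    # Best-effort wipe: Python may still hold temporary copies elsewhere.
    sealing_secret[:] = bytes(len(sealing_secret))
    return keys
```

Each label yields an independent key, so compromising one sealing key doesn't expose the others or the original secret.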

Each stage hands a deterministically encoded CBOR object to the next stage. This object contains secrets and the DICE certificate chain, which contains accumulated status information, such as whether the last stage loaded securely.
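The way secrets accumulate along the boot chain can be sketched as a chain of derivations, where each stage mixes a measurement of the next stage into the secret it hands over. This is a simplified model: it omits certificates and CBOR encoding entirely, uses HMAC-SHA256 as an assumed derivation function, and the names next_cdi and run_chain are hypothetical.

```python
import hashlib
import hmac


def next_cdi(current_cdi: bytes, next_stage_image: bytes) -> bytes:
    """Derive the next stage's secret from the current one and a measurement."""
    measurement = hashlib.sha256(next_stage_image).digest()
    return hmac.new(current_cdi, measurement, hashlib.sha256).digest()


def run_chain(root_secret: bytes, stage_images) -> bytes:
    """Walk the boot chain and return the secret seen by the final stage."""
    cdi = root_secret
    for image in stage_images:
        cdi = next_cdi(cdi, image)
    return cdi
```

Changing any stage in the chain changes every secret derived after it, so a modified stage can't recover the secrets of the original instance.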

Unlocked devices

When a device is unlocked with fastboot oem unlock, user data is wiped. This process protects user data from unauthorized access. Data that is private to a pVM is also invalidated when a device is unlocked.

Once unlocked, the owner of the device is free to reflash partitions that are usually protected by verified boot, including the partitions that contain the pKVM implementation. Therefore, pKVM on an unlocked device won't be trusted to uphold the security model.

Remote parties can observe this potentially insecure state by inspecting the device's verified boot state in a key attestation certificate.
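A remote party's check might look like the sketch below. It assumes the attestation certificate has already been parsed and its signature chain validated, which is omitted here. The enum values mirror the VerifiedBootState enumeration in the Android key attestation extension; the device_is_trustworthy function and its policy are hypothetical.

```python
from enum import IntEnum


class VerifiedBootState(IntEnum):
    """Boot states as encoded in the key attestation root of trust."""
    VERIFIED = 0
    SELF_SIGNED = 1
    UNVERIFIED = 2
    FAILED = 3


def device_is_trustworthy(device_locked: bool, boot_state: VerifiedBootState) -> bool:
    """Example policy: accept only locked devices that booted a verified image."""
    return device_locked and boot_state == VerifiedBootState.VERIFIED
```

A relying party with different requirements could accept other states, but an unlocked device should be treated as unable to uphold the pVM security model.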