RFD 388
Disk Encryption Keys at Rack Shipment

Purpose

By the time we ship the rack, the full-fledged trust quorum will not be available, so we cannot derive disk encryption keys from the rack cluster secret as described in [rfd301]. However, we still want to encrypt the entire zpool that owns each U.2 device by enabling encryption on its root dataset.

Threat models

Our threat models with regard to disk encryption escalate in step with our security capabilities. In other words, we can only handle the weaker attacks until trust quorum is fully implemented. We describe a few threat models below. In all cases, our security goal is to ensure that an attacker cannot recover data from stolen drives.

  • L1 - An attacker is capable of stealing an arbitrary number of disk drives.

  • L2 - An attacker is capable of stealing a single sled containing disk drives and is able to boot that sled.

  • L3 - An attacker is capable of stealing fewer than K sleds, where K is the trust quorum size, and booting them such that they communicate.

  • L4 - An attacker is capable of stealing K or more sleds, where K is the trust quorum size, and booting them such that they communicate.

This document proposes a mechanism to thwart threat model L1 by rack shipment. Threat models L2 and L3 are prevented by some form of trust quorum. L4 is outside of our current long-term threat model, even with trust quorum.
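
To illustrate why the threat models are drawn around the quorum size K, here is a toy K-of-N secret-sharing sketch. It assumes the trust quorum is built on a threshold scheme in the style of Shamir's secret sharing, which this document does not itself state. The field modulus and the function names split and combine are hypothetical choices for illustration, and the code is neither constant-time nor production-ready.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); an assumption, not a real parameter.
PRIME = 2**127 - 1


def split(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any k of them reconstruct it."""
    if not (1 <= k <= n < PRIME):
        raise ValueError("require 1 <= k <= n < PRIME")
    if not (0 <= secret < PRIME):
        raise ValueError("secret must be a field element")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def combine(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 from the given shares."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


if __name__ == "__main__":
    rack_secret = secrets.randbelow(PRIME)
    shares = split(rack_secret, k=3, n=5)  # one share per sled, K = 3
    print(combine(shares[:3]) == rack_secret)  # any K shares reconstruct
```

In such a scheme, any K shares reconstruct the secret, while fewer than K shares carry no information about it. That is the property separating an attacker below the quorum size from one at or above it.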

Determination

We ended up implementing and shipping the Low Rent Trust Quorum (LRTQ) as described in the bootstore README. This meets threat model L3 although it is lacking in other ways that the full trust quorum will fix.

We are using the storage key derivation described in RFD 301 section 4, including the U.2 Drive Info. The input key material placed into HKDF to generate the output key material (OKM in [rfd301]) comes from LRTQ.
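
As a minimal sketch of this kind of derivation, the following Python implements HKDF as specified in RFC 5869 using only the standard library. The hash function (SHA-256), the empty default salt, the 32-byte output length, the use of the drive info as the HKDF info parameter, and its byte encoding are all assumptions made for illustration; [rfd301] section 4 is the authority on the actual parameters. The name derive_disk_key is hypothetical.

```python
import hashlib
import hmac

HASH_NAME = "sha256"  # assumption: the real hash is whatever [rfd301] specifies
HASH_LEN = hashlib.new(HASH_NAME).digest_size


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract step: PRK = HMAC(salt, IKM)."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: a string of HashLen zero bytes
    return hmac.new(salt, ikm, HASH_NAME).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: stretch PRK into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH_NAME).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_disk_key(ikm: bytes, drive_info: bytes,
                    salt: bytes = b"", length: int = 32) -> bytes:
    """Derive a per-drive key (the OKM) from shared IKM and per-drive info."""
    return hkdf_expand(hkdf_extract(salt, ikm), drive_info, length)


if __name__ == "__main__":
    ikm = bytes(32)  # placeholder; in the scheme above the IKM comes from LRTQ
    # Hypothetical encoding of the U.2 drive info, for illustration only.
    key_a = derive_disk_key(ikm, b"vendor-x:model-y:serial-0001")
    key_b = derive_disk_key(ikm, b"vendor-x:model-y:serial-0002")
    print(key_a != key_b)  # distinct drives get distinct keys
```

Binding the drive-specific information into the derivation means each U.2 receives its own key from the same input key material, so recovering one drive's key does not directly reveal another's.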

Alternatives

  1. Leave the disks unencrypted. This is unsatisfying, because enabling encryption later would require rewriting all existing data to encrypt it. With encryption enabled from the start, the actual data is encrypted with a ZFS-specific random key that is wrapped by our provided key, so rotation lets us make our wrapping key stronger without having to re-encrypt all the data.

  2. Derive keys from cryptographically insecure IKM. This satisfies the L1 threat model, but is underwhelming at best.

  3. Generate and store a large random number in plaintext on the M.2s and use it as the input to HKDF to generate the OKM. This value can be made cryptographically secure, but an attacker who steals both a U.2 drive and an M.2 device can recover the data.

  4. Generate and store a large random number on the M.2s to use as the salt to HKDF, with the key being derived from non-cryptographically secure VPD information. This is better, but still allows an attacker to recover plaintext data by stealing an M.2 as well as a U.2 drive and discovering the VPD information.

Upgrade into the Trust Quorum model

We will upgrade from LRTQ into more secure trust quorum schemes in an online manner.

Security Considerations

This whole RFD is based around security compromises made to meet our urgency requirements around rack shipment.

LRTQ does not defend against online attacks where the attacker is on the bootstrap network. Other limitations are described in the bootstore README.