Questions from the “Compute Express Link™ (CXL™) Link-level Integrity and Data Encryption” Webinar
By Raghu Makaram and David Harriman
The recent “Compute Express Link™ (CXL™) Link-level Integrity and Data Encryption (CXL IDE)” webinar explored CXL IDE usage models and how security is managed across CXL.io, CXL.mem, CXL.cache and CXL Switches. The webinar also explored a device’s responsibility to maintain security. If you missed the live webinar, the recording is available on BrightTALK and YouTube. The presentation is also available for download on the CXL Consortium’s website.
We received several questions during the Q&A portion of the webinar. Below is a recap of the questions and answers discussed.
Q: In the figure on slide 11, if a key switch (exchange) happens between MAC_EPOCH1 and MAC_EPOCH2, when exactly will the Initialization Vector (IV) counter be configured at each (TX/RX) port? Is the RX side still using the old IV value to decrypt the first FLIT of MAC_EPOCH2?
In the definition of AES-GCM, all the FLITs in the Message Authentication Code (MAC) Epoch must be processed together. That means there is one key and one Initialization Vector (IV) for all the FLITs in the Epoch. The key switch must happen at the boundaries of the MAC Epoch. So, all the yellow FLITs would use one key and one IV. The green FLITs would use a different key and IV. The MAC for the yellow FLITs, even though it arrives later, would have been processed with the IV and the key from the previous Epoch. One way to address some of these nuances would be to send a truncated MAC and end the MAC Epoch prior to the key switch.
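The epoch-boundary rule above can be sketched as a small transmitter model. This is only an illustration under stated assumptions: the class name `Transmitter`, the method `request_key_switch`, and the parameter `flits_per_epoch` are hypothetical, keys are plain strings, and the epoch number stands in for the IV. No real encryption is performed.

```python
class Transmitter:
    """Toy model: a requested key switch takes effect only at a MAC Epoch boundary."""

    def __init__(self, key, flits_per_epoch):
        self.key = key                      # key protecting the current epoch
        self.pending_key = None             # key waiting for the next boundary
        self.flits_per_epoch = flits_per_epoch  # hypothetical epoch length
        self.epoch = 0                      # stands in for the per-epoch IV
        self.count = 0                      # FLITs sent in the current epoch

    def request_key_switch(self, new_key):
        # The new key is only recorded here; it is not used mid-epoch.
        self.pending_key = new_key

    def send(self, flit):
        # Every FLIT is tagged with the key/IV context of its own epoch.
        context = (self.epoch, self.key)
        self.count += 1
        if self.count == self.flits_per_epoch:
            self._end_epoch()
        return context

    def _end_epoch(self):
        # Boundary: advance the IV stand-in and apply any pending key.
        self.epoch += 1
        self.count = 0
        if self.pending_key is not None:
            self.key, self.pending_key = self.pending_key, None
```

In this sketch, a key switch requested after the first FLIT of an epoch does not affect the remaining FLITs of that epoch; the first FLIT of the next epoch is the first one tagged with the new key.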
Q: With CXL IDE, what is the impact on the latency of CXL.mem read and write commands?
The latency impact depends on the IDE mode used. During the presentation, we mentioned that there are two modes of aggregation: containment mode and skid mode. When skid mode is used, there is little to no latency impact because each FLIT is decrypted and released for further processing as soon as it is received. In containment mode, the FLITs are held back until the MAC check completes, so there is additional latency.
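The difference between the two modes can be sketched as a toy receiver. This is a minimal model under stated assumptions, not the specified behavior: the class `Receiver` and its methods `on_flit` and `on_mac` are hypothetical names, FLITs are plain strings, and the MAC check result is passed in as a boolean rather than computed.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    mode: str                                     # "containment" or "skid"
    held: list = field(default_factory=list)      # FLITs waiting on the MAC check
    released: list = field(default_factory=list)  # FLITs passed on for processing

    def on_flit(self, flit):
        if self.mode == "skid":
            # Skid mode: release immediately; the MAC is checked later.
            self.released.append(flit)
        else:
            # Containment mode: hold the FLIT until the epoch's MAC is verified.
            self.held.append(flit)

    def on_mac(self, mac_ok):
        # Called once the MAC for the current epoch has been checked.
        if self.mode == "containment" and mac_ok:
            self.released.extend(self.held)
        # On failure, held FLITs are dropped in this toy model.
        self.held.clear()
        return mac_ok
```

In containment mode the added latency is the time a FLIT spends in `held`; in skid mode that list stays empty, which is also why a failed check cannot recall FLITs that were already released.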
Q: When keys are changed on the fly, what will the MAC be calculated on?
The MAC is always based on the key and Initialization Vector (IV) that were used to protect the FLITs in the given MAC Epoch. AES-GCM defines how the actual FLIT content is mapped into additional authenticated data and plaintext, and this is processed with a key and an IV to generate the MAC, so the whole unit goes together as one. Once the key switch happens, new FLITs are encrypted and integrity protected with the new key.
Q: Are there requirements to distinguish IDE-related link resets from non-IDE or implementation-specific resets?
In general, for any reset that occurs when IDE is enabled and in use, secrets that are present (from the previous reset epoch) must be protected. Once you come back up from the reset, IDE keys must be cleared, and any plaintext and additional crypto state present must also be cleared.
Q: Does the required traffic have to be back-to-back? If it is sequential traffic, how is the IDE expected to behave?
Traffic doesn’t need to be back-to-back; there can be bursts of traffic. For example, you could send FLIT 1, and then there may be some amount of time before FLIT 2 arrives. The FLITs are ordered at the link level, so they are sent in that order and the receiver sees the same order that the transmitter sent. In practice, if the link has been idle for a long time, you may need to complete the MAC Epoch early by sending a Truncated MAC (TMAC) FLIT, which avoids large latencies in containment mode.
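As a rough illustration of the idle-link case, the transmitter's decision could look like the function below. Everything here is an assumption made for the sketch: the function name `next_action`, the parameters `flits_per_epoch` and `idle_limit`, and the returned action strings are hypothetical and are not taken from the specification.

```python
def next_action(flits_in_epoch, flits_per_epoch, idle_cycles, idle_limit):
    """Decide what a toy transmitter does next for the current MAC Epoch."""
    if flits_in_epoch == flits_per_epoch:
        # The epoch is full: send the regular MAC.
        return "send_mac"
    if flits_in_epoch > 0 and idle_cycles >= idle_limit:
        # The link went quiet mid-epoch: close the epoch early so a
        # containment-mode receiver is not left holding FLITs.
        return "send_truncated_mac"
    # Nothing to close yet, or more traffic may still arrive in time.
    return "wait"
```

The point of the sketch is only that the early termination is driven by how long FLITs have been waiting, not by how many have been sent.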
Q: Why is a selective stream not supported in CXL.cache?
The key observation is that in PCI Express®, the TLP (Transaction Layer Packet), which is the routable unit, is the basis for IDE. Setting aggregation aside for the moment, the end-to-end connection between two ports implementing selective IDE is “simple” because a switch takes the entire unit that comes in, determines which port it goes out of, and passes it along. CXL has a very different structure, where the MAC is applied across a relatively large number of FLITs that are potentially all routed to different places. This essentially precludes applying a selective IDE mechanism. Alternatively, what if you applied the MAC to each individual FLIT? That would significantly reduce the usable bandwidth, which is not desirable. This is a result of the natural difference between a FLIT-based protocol like CXL.cachemem and a larger packetized protocol like PCI Express, which applies to CXL.io as well.
Q: eDPC link reset: does it mean CXL.cachemem IDE needs to be restored after an eDPC link reset of CXL.io?
In general, IDE connections on CXL.io and CXL.cachemem are managed independently and are not tied to one another. Reset cases, of which there are many, require thoughtful treatment that is specific to the platform and application. For example, if any configuration that is managed via CXL.io is changed, it will probably in turn impact the CXL.cachemem connection. In many cases, when Downstream Port Containment is triggered, it means the physical link is going down or the device is being unplugged, so the entire device is going away and the event is not specific to just the CXL.io protocol. In a carefully designed system, it is possible to ensure that when IDE on one protocol stack goes down, the other is not impacted and can be kept alive. At the protocol level, CXL.io and CXL.cachemem are independent entities, but that does not mean they are independent at the usage scenario/application level. IDE only protects the connection. It is necessary to have appropriate security policies at both ends of this connection. If the security properties of CXL.io change and that impacts the security posture of the CXL.cachemem connection, then the device or host must handle it appropriately, but that is outside the purview of IDE.
Q: How is the monotonic counter on each side of the link used?
The monotonic counters are used as part of the Initialization Vector (IV) construction, as shown on slide 9. With AES-GCM, the IV is updated every time a MAC is transmitted. This means that when the transmitter encrypts and computes a MAC, it uses the key and a particular value of the IV. The receiver has the same key and needs to align to the same IV value; otherwise, the computation will come out incorrect and the integrity verification will fail. The monotonic counter ensures the IV stays in sync without needing to send the IV across the link.
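The counter-based synchronization can be sketched as follows. Several things here are assumptions made for illustration only: the IV layout (a fixed 32-bit field followed by a 64-bit counter), the starting counter value, and the names `make_iv` and `Endpoint` are hypothetical. HMAC-SHA256 from the Python standard library is used as a stand-in for the AES-GCM tag so the example stays self-contained; it is not the algorithm IDE uses.

```python
import hashlib
import hmac

FIXED_FIELD = bytes(4)  # hypothetical fixed portion of the IV


def make_iv(counter):
    # Assumed layout: 32-bit fixed field + 64-bit counter = 96-bit IV.
    return FIXED_FIELD + counter.to_bytes(8, "big")


class Endpoint:
    """Toy link endpoint that derives the IV from a local monotonic counter."""

    def __init__(self, key):
        self.key = key
        self.counter = 1  # assumed starting value, agreed out of band

    def mac(self, flits):
        iv = make_iv(self.counter)
        # Stand-in for the AES-GCM tag over the epoch's FLITs.
        tag = hmac.new(self.key, iv + b"".join(flits), hashlib.sha256).digest()
        self.counter += 1  # advance once per MAC, on both sides
        return tag
```

Because each side advances its own counter once per MAC, the two IVs match without the IV ever crossing the link. If one side's counter drifts, the tags no longer agree and verification fails, which is the failure described above.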
Q: Is it possible for a device using CXL.mem or CXL.cache to optionally not participate with the security protocol?
CXL.mem and CXL.cache are both coherency protocols that are coupled by the CXL.cachemem IDE. For example, with a Type 2 device, a given FLIT could contain some cache transactions and some mem transactions. In that case, it is not possible to separate those out independently. But if the device were purely a Type 3 device, it would only need to implement the CXL.mem protocol, and that would be protected by IDE.
Q: Is IDE optional or if not implemented on both sides, will the feature be negotiated at the start?
CXL.io has its own IDE capability and CXL.cachemem has its own IDE capability. They are both independent and optional. If an application requires the security that IDE provides, then that application can’t meaningfully work without IDE being supported on both sides and turned on. Both the device and the host are required to enforce a security policy that is appropriate for whatever they are doing. For example, a device that requires IDE for secure operation would be obligated to determine that IDE is not only supported but enabled and operating before it enters any secure operational mode. If IDE were to go down, then it would disable that secure operational mode. This is essentially an upper-level protocol requirement.
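The device-side policy described above can be sketched as a small state check. This is an illustration under assumptions, not a defined interface: the names `IdeState`, `Device`, `enter_secure_mode`, and `on_ide_state_change` are hypothetical.

```python
from enum import Enum


class IdeState(Enum):
    UNSUPPORTED = 0
    SUPPORTED = 1   # capability present but IDE not running
    ACTIVE = 2      # IDE enabled and operating


class Device:
    """Toy device that gates its secure mode on the IDE state."""

    def __init__(self, ide=IdeState.SUPPORTED):
        self.ide = ide
        self.secure_mode = False

    def enter_secure_mode(self):
        # Supported is not enough: IDE must be enabled and operating.
        if self.ide is not IdeState.ACTIVE:
            return False
        self.secure_mode = True
        return True

    def on_ide_state_change(self, new_state):
        self.ide = new_state
        if new_state is not IdeState.ACTIVE:
            # If IDE goes down, leave the secure operational mode.
            self.secure_mode = False
```

The check lives in the device's own policy logic rather than in IDE itself, which matches the point that this is an upper-level protocol requirement.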
Q: Will a switch have to interleave GCM streams with different keys from/to each endpoint? Or is this programmed once and then “static” until a key is refreshed?
It depends on the protocol. CXL.io IDE can have multiple streams, and TLPs belonging to different streams are protected by different keys. These naturally flow through the switch without the switch doing anything special. CXL.cache and CXL.mem use a link-level protocol, so the link IDE needs one set of keys between the root port and the switch and a different set of keys between the switch and the endpoint. All the FLITs on CXL.cache and CXL.mem are protected by the corresponding link IDE.
Q: Is there a "debug mode" for CXL/PCIe-LAI to handle IDE?
LAI here refers to logic analyzer/test equipment. This is essentially outside the scope of the PCI Express® (PCIe®) and CXL IDE definitions. We are working on a proposal for a uniform way of handling this. The difficulty is that test equipment would typically trigger on information that is now going to be encrypted. Even just to display what is happening on the link, the equipment must be able to decrypt packets, which means it needs the key, either because a set of debug-specific keys is agreed upon in advance or because there is a mechanism to share the key with the test equipment. That can be done offline. The more difficult problem is triggering on information that is encrypted, because that processing must be done in real time or close to real time.
Q: In skid mode the FLITs are released without waiting for the MAC. While this is good for bandwidth and latency, how does it work if the MAC is found to be incorrect? Does this not increase security risk?
We agree with the observation. There is an additional security risk, and it needs to be carefully weighed against the latency and bandwidth benefits.
Q: Can you specify what is expected from IDE firmware implemented outside CXL IP. For example, what is negotiated over the mailbox of CXL.io? Where is the definition for key exchange and agreeing on starting counter?
There is an ECN for the CXL.cache and CXL.mem protocols that covers the key establishment protocol. For CXL.io, the use of DOE mailboxes and how keys are negotiated are specified in the PCIe IDE ECN. This presentation walked through device attestation and key exchange flows. The expectation is that some firmware or a device security manager would be involved in negotiating the keys and ensuring they are programmed into the crypto engines.
Additionally, PCIe IDE established a key management protocol, which is the starting point. In CXL, its capabilities are somewhat expanded. The idea is to continuously evolve the PCIe and CXL specifications so that they are at the same level. We anticipate that both will gain some additional capabilities beyond what is currently defined in the key management protocol.
Q: How will IDE work with a MLD device?
Multiple hosts are connected to a multi-logical device (MLD). For example, when the device is directly connected, there will be multiple links in the device, each going to one host. In this case, there will be independent IDE on the link with each host. An MLD needs to understand and manage the implications at a usage model/system level. The device needs to provide the security properties required by the solution and needs to determine whether it is acceptable to run some links with IDE and others without it. Another system configuration is for the device to be connected through a switch. In this case, it is the switch that talks to the different hosts, and there is a single link going to the device. The device-level implications remain in this case as well.
Q: Is "CXL IDE Establishment ECN" ready now?
The CXL IDE establishment ECN has been published.
Q: What similarities does IDE have to HDCP for video protocols, and where does it fundamentally differ? Where might IDE be evolving in future iterations?
HDCP is defined for content protection of data sent from a transmitter to a specific receiver, so the HDCP focus is on data encryption to maintain confidentiality. PCIe and CXL IDE are general-purpose capabilities designed to address a variety of usages that require confidentiality, integrity and replay protection of the content transiting the link. With IDE, we anticipate a variety of usages such as accelerators, memory devices, networking and storage, where there can be fine-grained data sharing and data integrity is equally important. One important direction in which IDE will evolve is toward adding support for secure composable systems.
Q: Is there an established certificate-authority ecosystem CMA/SPDM? Where does an OS go to refresh its cert-chain?
Currently there is no single certificate authority (CA) for use with PCIe/CXL CMA/SPDM or IDE. This means that individual device vendors would generate their own certificates, either self-signed or with a root certificate obtained from a CA. The expectation is that this would be managed at a deployment level, and some service would likely be required to gather the needed certificates from the relevant vendors and make them available to the system software.
Compute Express Link™ and CXL™ Consortium are trademarks of the Compute Express Link Consortium.