Compute Express Link™ (CXL™) 2.0 Specification: Memory Pooling – Questions from the Webinar Part 1
Updated: May 21, 2021
By Mahesh Wagh and Rick Sodke
The CXL™ 2.0 specification, released in November 2020, includes support for a number of new features, including memory pooling for increased memory utilization while providing memory capacity on demand. In the recent CXL Consortium webinar, we explored how CXL 2.0 supports memory pooling for multiple logical devices (MLD) as well as a single logical device with the help of a CXL switch.
We received many great questions from attendees throughout the live webinar, but we couldn’t answer them all during the Q&A portion. Additional questions we received from attendees are answered below and sorted by topic.
Security Questions
How is data security managed with CXL? CXL 2.0 supports encryption, device attestation/authentication, and key exchange protocols for both CXL.io and CXL.mem. Refer to section 11 of the CXL 2.0 specification for details.
Is there a security wrapper around FM traffic as it can allocate the pooled resources? The connection from FM to the switch can use CXL or PCIe security methods to ensure the communications are secure.
Is there a security protocol associated with the Fabric Manager transactions over MCTP (like SPDM)? SPDM can be used for secure communications over MCTP.
Does unbinding or reallocation of memory cause memory zeroization to enforce security? Not explicitly. Before the UNBIND, the host can clear the memory contents, and after the UNBIND, the Fabric Manager can clear them. Good system practice is to use both methods before binding the memory to a new host.
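The clear-before-and-after-UNBIND practice described above can be sketched as a small Python model. This is a toy illustration only: the `LogicalDevice` class and the `host_clear`, `fm_unbind`, `fm_clear`, `fm_bind`, and `reassign` functions are hypothetical names invented for this sketch and are not part of the CXL 2.0 specification or the FM API.

```python
class LogicalDevice:
    """Toy stand-in for one pooled memory region (hypothetical, not a CXL structure)."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.bound_host = None  # name of the host currently bound, or None


def host_clear(ld, host):
    """The bound host zeroes the memory before it gives the region up."""
    if ld.bound_host != host:
        raise PermissionError("only the bound host can clear in-band")
    ld.memory[:] = bytes(len(ld.memory))


def fm_unbind(ld):
    """The Fabric Manager unbinds the region from its current host."""
    ld.bound_host = None


def fm_clear(ld):
    """The Fabric Manager zeroes the memory while no host is bound."""
    if ld.bound_host is not None:
        raise RuntimeError("region is still bound to a host")
    ld.memory[:] = bytes(len(ld.memory))


def fm_bind(ld, host):
    """The Fabric Manager binds the region to a new host."""
    if ld.bound_host is not None:
        raise RuntimeError("region is already bound")
    ld.bound_host = host


def reassign(ld, old_host, new_host):
    """Apply both clearing methods before the region reaches the new host."""
    host_clear(ld, old_host)  # clear by the host, prior to UNBIND
    fm_unbind(ld)
    fm_clear(ld)              # clear by the Fabric Manager, after UNBIND
    fm_bind(ld, new_host)
```

Using both steps means the new host sees zeroed memory even if the previous host skipped or failed its own clear.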
Fabric Manager (FM) Questions
Where can we find the FM API? Is that already defined? Yes. It is defined in section 7.6 of the CXL 2.0 specification.
Are the host to FM messages part of the CXL spec? Who defines that? Communications between the Host and Fabric Manager are outside the scope of the CXL 2.0 Specification. Other standards support these communications.
Is there a plan to have a reference implementation of FM in OpenBMC? The implementation of the FM is outside the scope of the CXL 2.0 specification.
Does Fabric Manager take latency to the device into account as well? How about special purpose memories with built in processing? The Fabric Manager is aware of any attached devices and can access their configuration space prior to binding to determine their capabilities.
Is the Host to FM notification in-band or out of band? There is no path within CXL between the Host and the Fabric Manager so it must be out of band.
If Host 2 is talking to FM OOB, how is the communication back to H1 & H2 done in-band? The best way to think of it is this: the Host communicates with the FM out of band, the FM issues commands to the switch using the FM API, and as a result of those commands the switch or the device sends an interrupt to the host to notify it of changes. These notifications can take the form of a PCIe managed hot-plug notification or an update to the CXL standard Coherent Device Attribute Table.
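The flow in the answer above (out-of-band request to the FM, FM command to the switch, in-band notification to the hosts) can be sketched as a toy Python model. Everything here is an assumption made for illustration: the `Host`, `Switch`, and `FabricManager` classes, their methods, and the event tuples are hypothetical and do not correspond to actual FM API commands or to real PCIe/CXL notification formats.

```python
class Host:
    """Toy host that records the in-band notifications it receives."""

    def __init__(self, name):
        self.name = name
        self.notifications = []

    def interrupt(self, event):
        # Stands in for a hot-plug style interrupt delivered in-band.
        self.notifications.append(event)


class Switch:
    """Toy switch that tracks which host each logical device is bound to."""

    def __init__(self):
        self.bindings = {}  # logical device id -> Host

    def bind(self, ld_id, host):
        self.bindings[ld_id] = host
        host.interrupt(("hot-add", ld_id))

    def unbind(self, ld_id):
        host = self.bindings.pop(ld_id)
        host.interrupt(("hot-remove", ld_id))


class FabricManager:
    """Toy FM: receives requests out of band and commands the switch."""

    def __init__(self, switch):
        self.switch = switch

    def handle_oob_request(self, ld_id, new_host):
        # The request itself never travels over CXL in this model.
        if ld_id in self.switch.bindings:
            self.switch.unbind(ld_id)
        self.switch.bind(ld_id, new_host)
```

In this model a host learns about a change only through `interrupt`, which mirrors the point that the Host and the FM have no direct path between them within CXL.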
Does the FM process run as a process on one of the hosts or in the switch itself? The Fabric Manager can be a process on an external Baseboard Management Controller, a process running on any Host, or a process running on the Switch CPU.
Is there OS support in place for kernel and/or application interactions with the FM? Communications between the Host and Fabric Manager are outside the scope of the CXL 2.0 Specification. Other standards support these communications.
What part of FM is implemented in hardware, FM Owned LD or FM API? Are there registers to store Host requests and Device capabilities? We cannot comment on implementations, but devices that support MCTP would typically have a firmware component.
For direct connect is there still an FM in place to manage partitioning the pool? Direct-attached multi-port memory controllers would need a Fabric Manager connection if resources need to be reallocated between ports. Theoretically, the allocations could be static using vendor-defined mechanisms, and no Fabric Manager would be required.
If the Fabric Manager dies, will that impact the whole memory pool? Can the Fabric Manager be run from multiple entities? If the Fabric Manager becomes unreachable, the switch continues operating, but any task requiring Fabric Manager involvement will fail. Redundant Fabric Management is outside the scope of the CXL 2.0 specification.
What happens if the FM or connections between devices/hosts and the FM dies in the middle of such an operation? See the answer above.
In a multi-switch topology, is there a notion of a Master FM among the switches? How do the FMs coordinate allocations? See the answer above.
Questions related to Switching and Multi-Logical Devices (MLD) will be answered in a subsequent blog post.