
Use Cases for CXL RAS Firmware-First Error Handling.pdf

Uploaded by: 明**** | ID: 1011407 | 2025-12-21 | 13 pages | 662.34 KB

Use Cases for CXL RAS Firmware-First Error Handling
Intel Corporation
Harapanahalli, Manjunaatha B, Server Firmware BIOS Architect
Chen, Arvin, Platform Validation Engineer

Outline
1. Introduction
2. Firmware-First in CXL RAS: Real-World Lessons on Mailbox Design
3. Impact of SMI Latency on System Performance
4. Common Error Signaling Protocols
5. UUID to GUID: Translating CXL Errors with Correct Format
6. CXL Communication Errors: Boot-Time and Runtime Handling
7. Error Pollution with CXL Error

Introduction
The CXL ecosystem comprises a multitude of component vendors (SoC, memory, storage, networking, and so on). The explosive growth of internet content, and the data storage and computation requirements that come with it, have led to the deployment of heterogeneous and complex solutions in very large-scale data centers. These warehouse-sized buildings are packed with server, storage, and network hardware. Specifically, if hardware detects an uncorrected fatal error that poses a containment risk, the system needs to be reset and restarted, if possible, to enable continued operation. If the error affects the entire CXL device, a persistent memory device is considered to have experienced a dirty shutdown.

The idea of primary and secondary mailboxes is to keep firmware and the OS from stepping on each other. Firmware-first support requires the CXL memory device to implement a secondary mailbox. In the initial stages it was a challenge to get this support from CXL IHVs. To support the engineering and debug effort, the Intel firmware team added an option to use the primary mailbox during runtime, so that the MEFN feature could be validated with IHVs whose devices lacked secondary mailbox support. The primary and secondary mailbox queues are different, but the Device Status Register is common…
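The mailbox arrangement described above can be sketched as a small selection policy. This is only an illustration under stated assumptions: the names `Mailbox`, `select_firmware_mailbox`, `has_secondary`, and `allow_primary_fallback` are hypothetical and do not come from the slides or from any real firmware API, and the assignment of the primary mailbox to the OS is inferred from the text rather than stated in it.

```python
from enum import Enum
from typing import Optional


class Mailbox(Enum):
    PRIMARY = "primary"      # assumed to be the OS's mailbox in normal operation
    SECONDARY = "secondary"  # assumed to be the mailbox firmware-first handling uses


def select_firmware_mailbox(has_secondary: bool,
                            allow_primary_fallback: bool = False) -> Optional[Mailbox]:
    """Pick the mailbox a firmware-first handler would use at runtime.

    has_secondary: whether the device implements a secondary mailbox.
    allow_primary_fallback: hypothetical debug option that lets firmware
    use the primary mailbox when no secondary mailbox exists.
    """
    if has_secondary:
        return Mailbox.SECONDARY
    if allow_primary_fallback:
        # Debug/validation path only: firmware and OS now share one mailbox.
        return Mailbox.PRIMARY
    # Without either, firmware-first handling cannot talk to the device.
    return None


if __name__ == "__main__":
    print(select_firmware_mailbox(has_secondary=False, allow_primary_fallback=True))
```

The fallback is kept behind an explicit flag because sharing the primary mailbox reintroduces the contention that the two-mailbox split is meant to avoid.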

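According to the report, the document centers on the RAS (Reliability, Availability, Serviceability) features of CXL (Compute Express Link) and on how errors are handled. Key points:
1. **CXL ecosystem**: made up of many component vendors, including SoC, memory, storage, and networking.
2. **Error handling**: CXL RAS uses firmware-first error handling, which requires the memory device to implement a secondary mailbox.
3. **SMI latency**: SMI handler latency is a challenge; caching and 64-bit accesses reduce it significantly.
4. **Error signaling protocols**: CXL errors are recorded as CPER (Common Platform Error Record) entries, with UUIDs converted to GUIDs.
5. **Communication error handling**: boot-time and runtime communication failures are handled through the EWL (Enhanced Warning Log) and SCI (System Control Interrupt).
6. **Error pollution**: a single device error can produce multiple reports, so a standardized error-handling flow is needed.
7. **Resources**: more information and products are available on the Intel website and the official Compute Express Link website.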
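The UUID-to-GUID conversion in point 4 can be illustrated with a short Python sketch. It assumes the conversion in question is the usual byte-order difference between an RFC 4122 UUID (all fields big-endian) and a UEFI-style GUID (first three fields little-endian); the slides' actual procedure is not visible in this preview. The helper name `uuid_to_guid_bytes` and the example value are made up for illustration and are not a real CXL event record identifier.

```python
import uuid


def uuid_to_guid_bytes(u: uuid.UUID) -> bytes:
    """Return the 16-byte GUID layout for a UUID.

    uuid.UUID.bytes_le stores time_low, time_mid and time_hi_version
    in little-endian order and leaves the last 8 bytes unchanged,
    which matches the mixed-endian GUID layout assumed here.
    """
    return u.bytes_le


# Illustrative value only, chosen so the byte swaps are easy to see.
example = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

if __name__ == "__main__":
    print(example.bytes.hex())                # 00112233445566778899aabbccddeeff
    print(uuid_to_guid_bytes(example).hex())  # 33221100554477668899aabbccddeeff
```

Only the first three fields are byte-swapped; copying the 16 bytes unchanged would yield a record whose section type does not match what the consumer expects.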