《Hot Chips 2022 CXL3 Coherence Deep Dive.pdf》由会员分享,可在线阅读,更多相关《Hot Chips 2022 CXL3 Coherence Deep Dive.pdf(48页珍藏版)》请在三个皮匠报告上搜索。
1、PublicCoherence Deep Dive for CXLRob Blankenship Intel Corporation and CXL Protocol Working Group co-chairAugust 2022Copyright|CXL Consortium 2020-Hot Chips 2022 CXL Tutorial Public Coherence/Caching Primer CXL Cache Hierarchy CXL.Cache Deep Dive What is new in CXL3(Device Scaling)CXL.Mem Deep Dive
2、What is new in CXL3 Direct P2P to HDM/Multi-Host CoherenceAgenda8/18/20222Copyright|CXL Consortium 2020-Hot Chips 2022 CXL Tutorial PublicCaching PrimerCopyright|CXL Consortium 2020-Hot Chips 2022 CXL Tutorial 8/18/20223Public Caching temporarily brings data closer to the consumer Improves latency a
3、nd bandwidth using prefetching and/or locality Prefetching:Loading Data into cache before it is required Spatial Locality(locality is space):Access address X then X+n Temporal Locality(locality in Time):Multiple access to the same DataCaching Overview8/18/20224Copyright|CXL Consortium 2020-Hot Chips
4、 2022 CXL Tutorial AcceleratorLocal Data CacheAccess Latency:10nsDedicated Bandwidth:100+GB/sReadDataReadDataHost MemoryAccess Latency:200nsShared Bandwidth:100+GB/sPublic Modern CPUs have 2 or more levels of coherent cache Lower levels(L1),smaller in capacity with lowest latency and highest bandwid
5、th per source.Higher levels(L3),less bandwidth per source but much higher capacity and support more sources Device caches are expected to be up to 1MB.CPU Cache/Memory Hierarchy with CXL8/18/20225Copyright|CXL Consortium 2020-Hot Chips 2022 CXL Tutorial Note:Cache/Memory capacities are examples and
6、not aligned to a specific product.CPU Socket 0CPUL150 KB500 KB L2.10 MB L3(aka LLC)500 KB L210 GB Directly Connected Memory(aka DDR)CXL.Cache.CXL.mem10 GBHome AgentCXL.ioPCIeCPU Socket 1CPUL150 KBCPUL150 KBCPUL150 KBCoherent CPU-to-CPU Symmetric LinksCXL.mem10 GBCXL.CacheCXL.ioPCIeWr Cache50KBDevice