OCP Global Summit, October 18, 2023 | San Jose, CA
Youngpyo Joo / Fellow, SK hynix
Exploring Near-Memory Processing Architecture: Design Challenges and Considerations for Programming Models

Motivation
  Rapidly increasing computing resource demands
  Corresponding absolute and relative increases in server power consumption

Why Computational Memory
  [Figure: power consumption (watts) and performance; performance vs. power]
  Source: "Recalibrating global data center energy-use estimates," Volume 367, Issue 6481, pp. 984-986, Feb. 2020

Why Computational Memory
  Conventional system: performance-per-watt saturation
    Efficiency of performance improvement decreases
  Computational memory: memory with Near-Memory Processing (NMP)
    Reducing data movement between CPU and memory: energy efficient
    Offloading inefficient CPU computations: better performance
    Memory and computing in a single solution: scalability
  Move computing nearby data
  [Figure: Conventional System (CPU, DDR DIMM, PCIe) vs. Computational Memory System (CXL); performance and power for each]

Energy efficient, scalable performance
  Inefficient CPU computations
    Low locality (cache hit rate)
    Simple and parallel
    Memory-intensive
  [Figure: CPU, DDR DIMM, PCIe, SSD, Computational Memory (Memory, SSD, NMP); performance vs. #CM (scalability)]
  Remove data movements from GM to HM
Comparison with Conventional Accelerator
  [Figure, conventional accelerator: CPU, Host Memory (HM), PCIe, DMA, Compute Unit, Device Memory, Global Memory (GM)]
  [Figure, computational memory: CPU, Host Memory (HM), CXL, CXL Memory, NMP Unit]

  Conventional Accelerator
    Data source location: host memory
    Operation process:
      (1) DMA transfers source data from HM to GM
      (2) Compute Unit computes and stores results to GM
      (3) DMA transfers the result from GM to HM
      (4) CPU reads the result from HM

  Computational Memory
    Data source location: CXL Memory
    Operation process:
      (1) NMP Unit computes and stores results to CXL Memory
      (2) CPU directly reads the result from CXL Memory

  Remove data movements from HM to GM

CXL-CMS 2.0
CPU DDR SK hy
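
The two operation processes in the accelerator comparison can be sketched as a small Python model that counts how many bytes cross the host-device link in each flow. This is only an illustration, not anything from the presentation: the `Step` type, the `link_bytes` helper, and the byte sizes are hypothetical, and the model assumes that only DMA transfers and the CPU's read from CXL Memory cross the link, while reads from host memory stay local.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Step:
    """One step of an operation process (hypothetical model type)."""
    actor: str
    action: str
    # What crosses the host-device link in this step, if anything:
    # "source" (the input data), "result" (the output), or None.
    link_payload: Optional[str] = None


# Conventional accelerator: the data source lives in host memory (HM).
CONVENTIONAL_ACCELERATOR = (
    Step("DMA", "transfer source data from HM to GM", "source"),
    Step("Compute Unit", "compute and store results to GM"),
    Step("DMA", "transfer the result from GM to HM", "result"),
    Step("CPU", "read the result from HM"),
)

# Computational memory: the data source already lives in CXL Memory.
COMPUTATIONAL_MEMORY = (
    Step("NMP Unit", "compute and store results to CXL Memory"),
    Step("CPU", "read the result directly from CXL Memory", "result"),
)


def link_bytes(steps: Sequence[Step], source_bytes: int, result_bytes: int) -> int:
    """Sum the bytes that cross the host-device link under this model."""
    sizes = {"source": source_bytes, "result": result_bytes}
    return sum(sizes[s.link_payload] for s in steps if s.link_payload is not None)


if __name__ == "__main__":
    # Illustrative sizes only: a large input reduced to a small result.
    src, res = 1 << 30, 4 << 10
    for name, flow in (("conventional accelerator", CONVENTIONAL_ACCELERATOR),
                       ("computational memory", COMPUTATIONAL_MEMORY)):
        print(f"{name}: {len(flow)} steps, {link_bytes(flow, src, res)} bytes over the link")
```

Under these assumptions the computational-memory flow has two steps instead of four and never moves the source data, which matches the "remove data movements from HM to GM" point: the saving grows with the ratio of input size to result size.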