Memory Technology Optimized for At-Scale AI Systems: Bandwidth, Capacity, and Connectivity

Document no. 1012004 | PDF | 24 pages | 2.69 MB

Memory technology optimized for at-scale AI systems
Siamak Tavallaei, Sr. Principal Engineer, Samsung Semiconductor, Inc.
Millind Mittal, Founder, MemWize.AI
SERVER: COMPOSABLE MEMORY SYSTEMS (CMS)

Outline
  Levels of memory tiers in AI infrastructure
  Memory growth drivers and mapping of workloads to memory tiers
  Example SW frameworks for AI LLM inference
  A candidate cluster architecture for memory scaling
  Considerations and role of optics in addressing memory scaling challenge

Baseline Server Node
  Components: M: Local DDRx Memory; C: CPU; S: NVMe/PCIe SSD Storage; N: NIC
  Memory tiers (from high-BW, low-latency, through larger capacity, to networked bulk capacity):
    T0: SRAM/Cache
    T1: CPU-Mem
    T2: Local Node Storage
    T3: Storage on DC Network

AI Infra Memory Tiers
  T0: SRAM/Cache
  T1: GPU-HBM
  T1-SU: GPU-HBM-SU
  T2 (T2+): CPU-Mem (+CXL)
  T2 (T2+)-SU: CPU-Mem-SU (+CXL)
  T3: Storage
  T3-SU: Storage on SU
  T3-SO: Storage on SO
  T4: Storage on DC Network

Server-centric Memory Tier Pyramid vs. AI Infra Memory Tier Pyramids
Scale-up (SU) and Scale-out (SO) Fabrics
  Baseline Server Node:
    T0: SRAM/Cache
    T1: CPU-Mem
    T2: Local Node Storage
    T3: Storage on DC Network
  Server Node with memory expansion:
    T0: SRAM/Cache
    T1: CPU-Mem
    T1+: CPU-CXL Mem
    T2-CXL: CPU-CXL Fabric Memory
    T2-SO: Memory SO
    T3: Local Node Storage
    T4: Storage on DC Network
    Remote Memory T1(+) SO/DC

Growing Memory Capacity Needs
Reasoning and multi-modal models, and agentic AI, are driving accelerated growth in memory capacity and bandwidth:
  Multiple-fold increase in active KV contexts
  Longer-lived contexts: multi-turn, shared contexts (e.g. code generation)
  Growing knowledge DBs and databases of past conversations
  Growing model sizes
  Multi-terabyte capacity for embeddings for recommendation models
  Growing size of memory-resident models
  Mixture-of-experts, Collection-of-experts, ...

Mapping AI use cases to Memory Tiers
  Scratch pad for computation units
  Model wei
