SambaNova SN40L RDU: Breaking the Barrier of Trillion+ Parameter Scale Gen AI Computing.pdf

No. 465014 | PDF | 24 pages | 1.30 MB

SambaNova SN40L RDU: Breaking the Barrier of Trillion+ Parameter Scale Gen AI Computing
Raghu Prabhakar, Architect, SambaNova Systems
HotChips 2024
Copyright 2024 SambaNova Systems Inc.

SN40L: SambaNova's New Language-Optimized RDU
"Cerulean" Architecture-based Reconfigurable Dataflow Unit
- 1.5 TB High Capacity Memory
- 5nm TSMC
- 3-tier Dataflow Memory
- 1,040 RDU Cores
- 102B Transistors
- 64 GB High Bandwidth Memory
- 520 MB On-Chip Memory
- 638 TFLOPS (bf16)
Cerulean SN40L RDU: Generative AI Training and Inference

SN40L: SambaNova's New Language-Optimized RDU: 3-tier Memory System with SRAM, HBM, and DDR
- On-Chip SRAM: 8 GB, PBs per sec
- RDU High Bandwidth Memory: 1 TB
- RDU High Capacity DDR Memory: 24 TB
- Bandwidths: 1600 GB/s, 25.6 TB/s
- High throughput inference with caching
- Low Latency Model Switching (e.g., 0.01 s for Llama 3.1 8B)
- Dataflow enabled by large On-Chip Memory

SN40L Chip: Tile Architecture
- 1040 PCUs and PMUs
- PCU: Compute unit
- PMU: Memory unit
- S: Mesh switches
- AGCU: Portal to off-chip memory and IO

SN40L PCU
- Configurable as a systolic array or a SIMD vector unit with M lanes
- BF16, FP32, INT32, and INT8 compute data types, configurable storage data types
- Arithmetic, Logical, and Bitwise operations
- A cross-lane reduction tree (blue) to reduce along the vectorized dimension
- Tail stage provides transcendental functions, casting, and stochastic rounding capabilities

SN40L PMU
- Programmer managed scratchpad memory supports concurrent reads and writes
- Fragmentable, address-generation pipeline that can produce 4 addresses per cycle
- Data alignment crossbars enable high throughput tensor transformations such as transpose, dilation, downcast
- Address predication support enables composing multiple PMUs to store a larg...
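
The per-chip and system-level memory figures quoted in the slide text above can be cross-checked with a little arithmetic. The sketch below is only a back-of-envelope check under assumptions that do not appear in the extracted text: that the system-level numbers describe a 16-socket node, that Llama 3.1 8B weights are stored in bf16 at 2 bytes per parameter, and that the 1600 GB/s figure is the bandwidth used when copying a model from DDR into HBM. All variable names are illustrative.

```python
# Back-of-envelope check of the SN40L memory figures quoted in the slides.
# ASSUMPTIONS (not stated in the extracted slide text):
#   - 16 sockets per node
#   - bf16 weights, i.e. 2 bytes per parameter
#   - 1600 GB/s is the DDR -> HBM copy bandwidth used for model switching

SOCKETS = 16  # assumed node size

# Per-chip capacities from the slides, in GB (520 MB, 64 GB, 1.5 TB).
per_chip_gb = {"sram": 0.520, "hbm": 64.0, "ddr": 1536.0}

# Scale each tier to the assumed node size.
node_gb = {tier: size * SOCKETS for tier, size in per_chip_gb.items()}

# Size of an 8-billion-parameter model in bf16.
params = 8e9
bytes_per_param = 2
model_gb = params * bytes_per_param / 1e9

# Time to stream the whole model at the assumed copy bandwidth.
switch_seconds = model_gb / 1600.0

print(node_gb)
print(model_gb, switch_seconds)
```

Under these assumptions the node totals come out close to the 8 GB, 1 TB, and 24 TB tiers on the memory-system slide, and the streaming time matches the quoted 0.01 s model-switching example.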
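
The PCU description mentions a cross-lane reduction tree that reduces along the vectorized dimension. As a rough software analogy (not a model of the actual hardware), the sketch below combines per-lane values pairwise, level by level, so that M lanes are reduced in about log2(M) steps. The function name `tree_reduce` and the choice of 8 lanes are assumptions made for illustration.

```python
import operator
from typing import Callable, Sequence, Tuple


def tree_reduce(lanes: Sequence[float],
                op: Callable[[float, float], float]) -> Tuple[float, int]:
    """Pairwise (tree) reduction over per-lane values.

    Returns the reduced value and the number of tree levels used.
    Hypothetical helper for illustration only.
    """
    values = list(lanes)
    if not values:
        raise ValueError("need at least one lane")
    levels = 0
    while len(values) > 1:
        # Combine neighbouring lanes: (0,1), (2,3), ...
        nxt = [op(values[i], values[i + 1])
               for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            # An odd lane out is passed through to the next level.
            nxt.append(values[-1])
        values = nxt
        levels += 1
    return values[0], levels


# SIMD-style dot product: elementwise multiply in each lane, then reduce.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
w = [1.0] * 8
products = [a * b for a, b in zip(x, w)]
dot, depth = tree_reduce(products, operator.add)
print(dot, depth)  # 36.0 3
```

The same helper works with `max` or `min` as the operator, which is why a reduction tree is a natural fit for operations such as the row-wise maximum in softmax.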
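
The PCU tail stage is described as offering casting and stochastic rounding. The slides do not say how this is implemented; the sketch below shows one common software formulation of stochastic rounding from FP32 to BF16, in which random bits are added to the 16 low-order bits that will be discarded, so a value rounds up with probability proportional to its distance from the lower neighbour. The helper names are hypothetical, and the sketch ignores NaN, infinity, and overflow handling.

```python
import random
import struct


def float_to_bits(x: float) -> int:
    """Bit pattern of x as an IEEE 754 single-precision value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float(bits: int) -> float:
    """Inverse of float_to_bits."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def stochastic_round_to_bf16(x: float, rng: random.Random) -> float:
    """Round an FP32 value to a BF16-representable value stochastically.

    Illustrative sketch only: finite inputs away from overflow are assumed.
    """
    bits = float_to_bits(x)
    # Add 16 random bits to the part that truncation will drop; a carry
    # into bit 16 rounds the magnitude up to the next BF16 value.
    bits = (bits + rng.getrandbits(16)) & 0xFFFFFFFF
    # Keep sign, exponent, and the top 7 mantissa bits (the BF16 fields).
    return bits_to_float(bits & 0xFFFF0000)


rng = random.Random(0)
x = 1.0 + 2.0 ** -9  # a quarter of the way between 1.0 and 1.0078125
samples = [stochastic_round_to_bf16(x, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
print(sorted(set(samples)), mean)
```

Each sample is one of the two neighbouring BF16 values, while the average over many samples stays close to the original input, which is the property that makes stochastic rounding useful for low-precision accumulation.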
