FuriosaAI RNGD: Tensor Contraction Processor for Sustainable AI Computing (PDF, 33 pages)
RNGD: Tensor Contraction Processor for Sustainable AI Computing
June Paik, Co-Founder and CEO of FuriosaAI
© 2024 FuriosaAI Inc.

Key Points
01 Intro of RNGD, Sustainable AI Inference
02 HW Architecture and Chip Design
03 SW Full-stack Optimization

Timeline
2017-2021: FuriosaAI founded & launch of Gen 1 vision NPU
2021: GPT3 inspired RNGD
2022: RNGD development kick off
2024 May: RNGD raw silicon sample arrival
2024 July: First LLM demo

Our Mission
Make AI computing sustainable, enabling access to powerful AI for everyone on Earth

RNGD at a glance
512 TFLOPS: 64 TFLOPS (FP8) x 8 processing elements
48 GB: Memory capacity
256 MB SRAM: 384 TB/s on-chip bandwidth
1.5 TB/s: Memory bandwidth
150 W TDP: Targeting air-cooled datacenters
2 x HBM3, CoWoS-S
Peak compute by data type: INT8 (512 TOPS), BF16 (256 TFLOPS), INT4 (1 POPS), FP8 (512 TFLOPS)
PCIe P2P support for LLMs
Features for Cloud: Multiple-instance support, Virtualization, Secure boot & model encryption

Delivers high-performance LLM workloads, while keeping the power consumption within the 150 watt range

Package and die
2 x HBM3: 12-layer (24 GB) x 2
Silicon Interposer: CoWoS-S
SoC: TSMC 5nm
Core CLK: 1.0 GHz
Die Size: 653 mm²
Transistor count: 40 B

Early Performance Numbers: 60% higher perf/watt than L40S
Compared: RNGD, NVIDIA L40S, Intel Gaudi 2, Google TPU v
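The headline figures on the spec slides can be cross-checked with a little arithmetic. The sketch below is not from the deck: `ChipSpec` and `derived` are hypothetical names introduced here, and the only inputs are the numbers quoted on the slides. It recomputes the aggregate FP8 peak and HBM capacity, then derives three ratios the slides do not state: a roofline-style ridge point (peak FLOPs per byte of HBM traffic), peak TFLOPS per watt at TDP, and the ratio of on-chip SRAM bandwidth to HBM bandwidth. These are theoretical upper bounds, assuming peak compute and TDP are reached at the same time. They are not the measured perf/watt behind the "60% higher than L40S" claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipSpec:
    """Figures quoted on the RNGD spec slides (hypothetical container)."""
    pe_fp8_tflops: float = 64.0   # FP8 TFLOPS per processing element
    num_pes: int = 8              # processing elements per chip
    hbm_stacks: int = 2           # HBM3 stacks
    hbm_stack_gb: float = 24.0    # capacity per 12-layer stack, GB
    hbm_bw_tbs: float = 1.5       # HBM bandwidth, TB/s
    sram_bw_tbs: float = 384.0    # on-chip SRAM bandwidth, TB/s
    tdp_watts: float = 150.0      # thermal design power, W


def derived(spec: ChipSpec) -> dict:
    """Derive aggregate and ratio figures from the quoted specs."""
    peak = spec.pe_fp8_tflops * spec.num_pes  # TFLOPS
    return {
        "peak_fp8_tflops": peak,
        "hbm_capacity_gb": spec.hbm_stacks * spec.hbm_stack_gb,
        # TFLOPS / (TB/s) = FLOP per byte: arithmetic intensity above
        # which a kernel is compute-bound rather than HBM-bound.
        "ridge_flop_per_byte": peak / spec.hbm_bw_tbs,
        # Upper bound only: peak compute divided by TDP.
        "peak_tflops_per_watt": peak / spec.tdp_watts,
        "sram_to_hbm_bw_ratio": spec.sram_bw_tbs / spec.hbm_bw_tbs,
    }


if __name__ == "__main__":
    for name, value in derived(ChipSpec()).items():
        print(f"{name}: {value:.2f}")
```

The first two results reproduce the slide values (512 TFLOPS and 48 GB). The ridge point of roughly 341 FLOP/byte suggests why the deck stresses the 256 MB SRAM and its 384 TB/s bandwidth: a workload with lower arithmetic intensity than that is limited by HBM bandwidth unless its data is reused on chip.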