Chiplet-Based Compressed LLC Cache & Memory Expansion
Nilesh Shah, VP Business Development, ZeroPoint Technologies

Memory Challenge: SRAM Leakage, Scaling Limitations

Memory Challenge: AI
  AI architectures
    Training: Nvidia GPU L2 cache, HBM subsystem (de)compression
    Inference: custom accelerators (Groq, Tenstorrent)
  Use case: inference vs. training
    Memory capacity + latency: critical for response time
    Compute limited by the memory bandwidth wall
    Cost optimization
  Sources: Nvidia, Groq, Tenstorrent

Opportunity #1 | Compressed SRAM
  2-4X compression
  64-byte granularity compression/compaction
  [Figure labels: Redundant Data, Useful Data]

Cache Compression Opportunity: Requirements
  High compression (ZeroPoint)
  High performance (SmartMem)
  Low latency
  Small area overhead/footprint
  Low power
  Ease of integration (scratchpad, LLC)
  Transparent to user (self-contained)

Proposed Solution | Cache Compression IP
  Process-agnostic, portable IP solution
  Blocks: (De)Compressor, Tag Manager, Tag Arrays
  Democratize access
  Integrate into any SoC or chiplet

ZeroPoint Cache Compression IP Results
  2-4X compression ratio across a variety of workloads
  Cache-line-granularity compression algorithm
  15-30% performance acceleration
  Low latency: 5 cycles (ZSD algorithm)
  Area efficient: starting at 0.1 mm² in TSMC 5 nm
  Operates at line speed for L2$, L3$, SLC

Opportunity #2: SRAM Alternative - MRAM
  NuMem NuRAM: MRAM-bitcell-based memory
  2.5X denser than SRAM; scales down with process geometry
  85x-2000x lower leakage power than SRAM
  60-650x improvement in latency over DRAM
  2x HBM bandwidth (at an equivalent # of wires)
  Data retention without power
  Implementation: Meta Siracusa Extended Reality SoC
  [Figure labels: At-MRAM Neural Engine; 2.5X denser than SRAM. Source: Meta Siracusa Extended Reality Chip]

Chiplet Synergy
  Compression: 2-4X effective capacity
  NuRAM: 2.5X capacity ISO area
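
The compression and NuRAM capacity figures quoted above can be combined into a quick back-of-the-envelope estimate. The sketch below is illustrative only: it assumes the two gains are independent and simply multiply, which the slides do not state, and the function name `effective_capacity_mb` and the 32 MB baseline are hypothetical values chosen for the example.

```python
def effective_capacity_mb(baseline_sram_mb, density_gain=1.0, compression_ratio=1.0):
    """Estimate effective cache capacity (MB) for a fixed silicon area.

    Assumption (not from the slides): the density gain and the compression
    ratio are independent, so their effects multiply.
    """
    if baseline_sram_mb <= 0 or density_gain <= 0 or compression_ratio <= 0:
        raise ValueError("all inputs must be positive")
    return baseline_sram_mb * density_gain * compression_ratio


BASELINE_SRAM_MB = 32.0          # hypothetical LLC size, for illustration only
NURAM_DENSITY_GAIN = 2.5         # from the slides: 2.5X capacity at ISO area
COMPRESSION_RATIOS = (2.0, 4.0)  # from the slides: 2-4X compression

if __name__ == "__main__":
    for ratio in COMPRESSION_RATIOS:
        compressed_only = effective_capacity_mb(
            BASELINE_SRAM_MB, compression_ratio=ratio
        )
        combined = effective_capacity_mb(
            BASELINE_SRAM_MB, NURAM_DENSITY_GAIN, ratio
        )
        print(
            f"compression {ratio:.0f}X: "
            f"SRAM + compression = {compressed_only:.0f} MB, "
            f"NuRAM + compression = {combined:.0f} MB"
        )
```

Under the multiplicative assumption, the combined gain over plain SRAM at the same area would be 5X to 10X. Actual gains depend on how compressible the workload's data is and on tag and metadata overheads, which this sketch ignores.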