2022 Groq, Inc. | Public HotChips34-2022

The Groq Software-defined Scale-out Tensor Streaming Multiprocessor
From chips-to-systems architectural overview

Dennis Abts
Chief Architect & Groq F

Dennis Abts, John Kim, Garrin Kimmell, Matthew Boyd, Kris Kang, Sahil Parmar, Andrew Ling, Andrew Bitar, Ibrahim Ahmed, Jonathan Ross

Outline
01 Tensor Streaming Processor (TSP) Background
02 Software-defined Hardware and Deterministic Execution
03 TSP Microarchitecture
04 System Packaging, Topology, Routing, and Flow Control
05 Summary

The Software-defined Approach
Hardware-software co-design is nothing new. What we are doing is re-examining the hardware-software interfaces.
Static-dynamic Interface: what is performed at “compile time” (statically) versus “execution time” (dynamically). This interface is managed by the runtime layer.
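The compile-time versus execution-time split can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four-operator graph, the fixed latencies, and the function names `compile_schedule` and `run` are hypothetical and are not Groq's compiler or runtime. The "compile" step assigns every operator a fixed start cycle from known latencies; the "execute" step then follows that schedule without checking any dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical graph: each operator maps to the operators whose results it consumes.
GRAPH = {
    "load_a": [],
    "load_b": [],
    "mul": ["load_a", "load_b"],
    "add": ["mul", "load_b"],
}

# Assumed fixed latencies in cycles (illustrative values only).
LATENCY = {"load_a": 2, "load_b": 2, "mul": 3, "add": 1}

# Hypothetical kernels standing in for real operators.
KERNELS = {
    "load_a": lambda: 3,
    "load_b": lambda: 4,
    "mul": lambda a, b: a * b,
    "add": lambda m, b: m + b,
}


def compile_schedule(graph, latency):
    """'Compile time': give each operator a fixed start cycle."""
    start = {}
    for op in TopologicalSorter(graph).static_order():
        # An operator may start once every producer has finished.
        start[op] = max((start[p] + latency[p] for p in graph[op]), default=0)
    return start


def run(schedule, graph, kernels):
    """'Execution time': replay the schedule; no dependency checks are made."""
    values = {}
    for op in sorted(schedule, key=schedule.get):
        values[op] = kernels[op](*(values[p] for p in graph[op]))
    return values


schedule = compile_schedule(GRAPH, LATENCY)
result = run(schedule, GRAPH, KERNELS)
print(schedule)
print(result["add"])
```

Because every latency is positive, a consumer always receives a later start cycle than its producers, so replaying operators in start-cycle order is safe without any run-time readiness check. That guarantee only holds when latencies are known ahead of time, which is the property the sketch is meant to highlight.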
Hardware-software Interface: what architectural state is “visible” to the compiler, such that we can reason about correctness and provide predictable performance.
“Nodes” in the computational graph represent operators and “edges” are the operands and results.
Operators fire only when all their input operands are available.
Machine learning models are a good fit for this static analysis and deterministic execution.
[Figure: software stack diagram. Labels: Software, Runtime System, TSP Hardware, Parallelizing Compiler, CoreML, PyTorch, TensorFlow, Custom Applications, Keras, XGBoost, Scikit-learn, Model Converters, ONNX, MLIR, Assembler, Bare-metal Programming Interface, Hardware-software Interface, Exception Handling]

Designing for Determinism
Building hardware to be an efficient compiler target.
Design choices along the way need to accommodate the “design for determinism” design philosophy.
Hardware must enable the c