Experiences with Extending The RISC-V ISA for Matrix/AI
Fan Fujie (范福杰)
R&D Director, Stream Computing

CONTENTS
  Architecture
  Implementation
  Eco-system
  Open-source Plan

Architecture
  Tile-based Matrix Multiplication.
  RISC-style Instructions & GPR Architecture.
  Configurable Parameters for Implementations.
  Separate Tile Registers & Accumulation Registers.
    8 architectural tile registers, tr0-tr7
    8 architectural accumulation registers, acc0-acc7
  Unprivileged CSRs for Type & Size Settings.

Architecture
  Type system with optional standard extensions.

  Extension     Type Support     Typical Applications
  Zmi4          4-bit integer    Edge-side AI inference; LLM Inference
  Zmi8          8-bit integer    Edge-side AI inference; LLM Inference
  Zmi16         16-bit integer   Scientific computing
  Zmi32         32-bit integer   Scientific computing
  Zmi64         64-bit integer   Scientific computing
  Zmf8e4m3      FP8 (E4M3)       AI inference & training (forward)
  Zmf8e5m2      FP8 (E5M2)       AI inference & training (backward)
  Zmf16e5m10    FP16             Cloud-side AI inference & training
  Zmf16e8m7     BF16             Cloud-side AI training (or inference)
  Zmf32e8m23    FP32             Scientific computing
  Zmf19e8m10    TF32             Cloud-side AI inference & training
  Zmf64e11m52   FP64             Scientific computing

Architecture
  Config-setting Instructions
    msettype, msettilem, msettilen, msettilek, msettypei, msettilemi, msettileni, msettileki.
  Load/Store Instructions
    mlae*.m, mlbe*.m, mlce*.m, mltre*.m, msae*.m, msbe*.m, msce*.m, mstre*.m.
  Data Move Instructions
    mmve*.a.t, mmve*.t.a, mmve*.x.a, mmve*.x.t, mfmve*.f.t, mbcar.m, mbcaee*.m, mtae*.m.
  Matrix Multiply Instructions
    mma.mm, msma.mm, mwma.mm, mqma.mm, moma.mm, mfma.mm, mfwma.mm, mfqma.mm.

https:/

  [...] Instructions
    madd.mm, msub.mm, mmin.mm, mmax.mm, mfadd.mm, mfsub.mm, mfmin.mm, mfmax.mm.
  Type-convert Instructions
    mcvt.x.xu.m, mcvt.xu.x.m, mwcvt.xw.x.m, mncvt.x.xw.m, mfwcvt.fw.f.m, mfncvt.f.fw.m, mfcvt.x.f.m, mfcvt.f.x.m.

Architecture
  Instruction Encoding (32-bit)

  Field     Desc.
  OP-M32    The major opcode (1110111)
  funct6    Function
  imm       Immediate
  ls        Load or store
  tr        Tr