MACO: A HW-Mapping Co-optimization Framework for DNN Accelerators
Speaker: Wujie Zhong
The Hong Kong University of Science and Technology (Guangzhou)
Guangzhou, China

Catalogue
- Introduction
- Related Works
- MACO
- Experiment
- Conclusion

Introduction: DNN accelerators
[Figure: GPU Tensor Core; TPU]

Introduction: Design Space Exploration
- Hardware space
  - The capacity of data buffers
  - The number of PEs
  - The number of MACs
- Mapping space
  - Loop boundaries
  - Loop order
- Tradeoff between power, performance and area (PPA)

Introduction: Hardware Space Exploration
[Figure: The architecture of a CNN accelerator: Simba (MICRO'19)]

Introduction: Hardware Space Exploration
- Explore the hardware parameters
  - Computation bound: more PEs or more MACs
  - Memory bound: higher bandwidth and larger buffers

Introduction: Mapping Space Exploration
- Loop nest

Introduction: Mapping Space Exploration
- Six memory levels
  - L0: PE Weight Register Level
  - L1: PE Accumulator Buffer Level
  - L2: PE Weight Buffer Level
  - L3: PE Input Buffer Level
  - L4: Global Buffer Level
  - L5: DRAM Level

Introduction: Mapping Space Exploration
[Figure: Part of an example of mapping a convolution layer onto a Simba-like chiplet]

Introduction: Mapping Space Exploration
- Buffer capacity constraint
  - PE Weight Buffer: 2 2 2 2

Related Works: Mapping Space Exploration
- Timeloop (ISPASS'19): exhaustive and random search
  - Challenge: huge design space
- GAMMA (ICCAD'20): genetic algorithm
- CoSA (ISCA'21): Mixed Integer Programming (MIP)
- LEMON (CF'23): Mixed Integer Programming (MIP)

Related Works: Hardware-Mapping Co-optimization
- DiGamma (DATE'22): genetic algorithm
  - Targets hardware with a two-level memory hierarchy
- DOSA (MICRO'23): gradient-based methods
  - Targets a single objective
- MEDEA (DATE'22): genetic algorithm
  - Suboptimal solutions

MACO: Overview

MACO: Hardware Space Search Block
- Multi-objective Bayesian optimization (MOBO)
- Evaluat