Next Generation “Zen 5” Core
Brad Cohen and Mahesh Subramony
Co-author: Mike Clark
Hot Chips 2024
NEXT GENERATION “ZEN 5” CORE | AUGUST 2024

The “Zen” Path (2020 to 2024)

“Zen 3”
  - +19% IPC [3] over “Zen 2”
  - 8-core complex
  - 32/16MB L3 per complex
  - CET shadow stack
  - 7nm/6nm

“Zen 4”
  - +14% IPC over “Zen 3”
  - AVX-512 on FP-256
  - 1M L2
  - VNNI and BFLOAT16
  - 5nm/4nm

“Zen 5”
  - +16% IPC [3] over “Zen 4”
  - AVX-512 variants, FP-512
  - 8-wide dispatch, 6 ALU
  - Dual pipe fetch/decode
  - 4nm/3nm

* 1. R5K-003; 2. EPYC-038; 3. All results are “up to”. See endnote GNR-03.

Design Objectives

Performance
  - Deliver another major 1T and 2T performance increase
  - Balanced cross-core 1T and 2T instruction and data throughput
  - Create front end parallelism
  - Increased execution parallelism
  - High throughput, efficient data movement and prefetching
  - AVX512 with 512-bit FP datapath for throughput and AI uplift

New Capabilities
  - Additional ISA extensions
  - New security features

Platform Support
  - Deliver “Zen 5” and “Zen 5c” core variants
  - Support configurable FP512/FP256 datapath
  - Support scaling and energy efficiency

“Zen 5” Microarchitecture Overview

2 Threads/Core
NextGen Branch Predictor

Caches
  - I-Cache: 32KB, 8-way; 2x 32B fetch/cycle
  - Op-Cache: 6K inst; 2x 6-wide fetch/cycle
  - D-Cache: 48KB, 12-way; 4 mem ops/cycle
  - L2-Cache: 1MB, 16-way

Dual I-Fetch/decode pipes, 4 inst/pipe
8 ops/cycle dispatched to Integer or FP

Execution capabilities
  - 6 integer ALU
  - 4 AGU, 4 addresses to LS per cycle
  - 4 FP ops/cycle; 2-cycle FADD

TLBs
  - L1: 64-entry ITLB, 96-entry DTLB
  - L2: 2K ITLB; 4K DTLB everything but 1G

Block diagram labels: ALU/Mul, ALU/Br, AGU; Scheduler; L1D Cache, 48KB 12-way, 4 read, 2 write, 64B fill, 64B victim; L2 Cache, 1MB 16-way; General-Purpose Registers, 64b, 240-entry; Integer Rename, 8-wide; Vector Rename, 6-wide; Sch, 38-entry; FMUL, FMA, FADD; StD, IntD, Vector
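
The cache and TLB figures on the overview slide can be turned into set counts and address reach with a little arithmetic. The sketch below is illustrative only and rests on assumptions the slide does not state outright: a 64-byte cache line (suggested by the "64B fill, 64B victim" label), 4 KB pages, and "4K" read as 4096 entries. The names `num_sets` and `tlb_reach_bytes` are hypothetical helpers, not anything from AMD.

```python
# Assumption: 64-byte cache lines, suggested by the "64B fill, 64B victim" label.
LINE_BYTES = 64
# Assumption: 4 KB pages; larger page sizes would increase TLB reach.
PAGE_BYTES = 4 * 1024

# (capacity in bytes, associativity) as listed on the overview slide.
CACHES = {
    "I-Cache": (32 * 1024, 8),
    "D-Cache": (48 * 1024, 12),
    "L2-Cache": (1024 * 1024, 16),
}


def num_sets(size_bytes, ways, line_bytes=LINE_BYTES):
    """Number of sets in a set-associative cache: size / (ways * line size)."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("capacity is not a whole number of sets")
    return sets


def tlb_reach_bytes(entries, page_bytes=PAGE_BYTES):
    """Memory covered when every TLB entry maps one page of page_bytes."""
    return entries * page_bytes


for name, (size, ways) in CACHES.items():
    print(f"{name}: {num_sets(size, ways)} sets")

print(f"L1 DTLB reach: {tlb_reach_bytes(96) // 1024} KB")
# Assumption: "4K DTLB" means 4096 entries.
print(f"L2 DTLB reach: {tlb_reach_bytes(4 * 1024) // (1024 * 1024)} MB")
```

Under these assumptions the I-Cache and D-Cache each work out to 64 sets (4 KB per way) and the L2 to 1024 sets, while the 96-entry L1 DTLB covers 384 KB and the L2 DTLB covers 16 MB of 4 KB pages.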