High-Parallel In-Memory NTT Engine with Hierarchical Structure and Even-Odd Data Mapping

Bing Li (1), Huaijun Liu (2), Yibo Du (3,4), Ying Wang (3,4)

(1) Institute of Microelectronics, Chinese Academy of Sciences
(2) Capital Normal University
(3) Institute of Computing Technology, Chinese Academy of Sciences
(4) University of Chinese Academy of Sciences

Outline
- Background and Motivation
- Proposed Method: Overview, Architecture & Data Mapping
- Evaluation and Results
- Conclusion

Fully Homomorphic Encryption
Application areas: Medical Treatment, Cloud Computing, Machine Learning, Fitness App
FHE review: Viand A., et al., S&P 2021
- Data Security
- Powerful Functionality
- High Computational Overhead

Classic NTT Challenges & Advantages
[Figure: three-stage butterfly network (Stage 1, Stage 2, Stage 3) for an 8-point NTT. Inputs a0, a4, a2, a6, a1, a5, a3, a7; outputs A0 through A7.]

Algorithm: In-Place Cooley-Tukey-based NTT
Input: a = (a_{n-1}, ..., a_0) in R; n-th root of unity in […] with bit-reversed order
Output: A = NTT(a) in bit-reversed order
1: […] = […]
2: for ([…] = 1; […]; […] = 2[…]) do
3:   […] = […]/2
4:   for ([…] = 0; […]
[The rest of the pseudocode and the start of the next slide were lost in extraction.]

Mod Algorithm Optimization
Adapt the original Barrett algorithm to an efficient implementation on CIM.

Original Barrett algorithm (surviving steps):
[…] n-1
3. t2 = t1 · mu
4. t3 = t2 >> (n+1)
5. r1 = c % 2^(n+1)
6. r2 = (t3 · q) % 2^(n+1)
7. r = r1 - r2
Condition: r […] q/2 ? (r - q) : r
Return r

Optimization: implementing in CIM
Calculation: r = c mod q (q: n bit)
1. x = c >> (n-1)
2. a = x […]
Condition: r […] q/2 ? (r - q) : r
Return r

[Figure: (a) Shift in CIM; (b) Subtraction in CIM. Bit-level example annotated with q, n, the value 829, MSB and LSB positions, operands c, x, a, t, b, and the operations Right shift, Left shift, and Sub.]
Benefits: Low Latency, Low Energy

Mod Module: Data Mapping
[Figure: RTL view of the Mod module. Labels: A0,msb; A0,lsb; A3,msb; A3,lsb; q_msb; q_lsb; SubArray0, SubArray64, SubArray128, SubArray192, SubArray191, SubArray255, each with a Sense Amplifier; WL Decoder & Driver; Read/Write & Comparator; MOD PE; Result A0; Result A3; RTL A25]
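
The Barrett steps listed under Mod Algorithm Optimization can be illustrated in software. The following is a minimal Python sketch of textbook Barrett reduction, not the CIM implementation from the slides. The function names `barrett_setup` and `barrett_reduce` are hypothetical, and the final correction (add 2^(n+1) if the difference is negative, then subtract q while r >= q) is the textbook form, assumed here because the slide's own condition is only partly legible. The modulus 829 is a value that appears in the slide's figure; using it as q is an assumption made for illustration.

```python
def barrett_setup(q):
    """Precompute n (bit length of q) and mu = floor(2^(2n) / q)."""
    n = q.bit_length()
    mu = (1 << (2 * n)) // q
    return n, mu


def barrett_reduce(c, q, n, mu):
    """Return c mod q for 0 <= c < 2^(2n), using shifts, multiplies, subtracts."""
    if not 0 <= c < (1 << (2 * n)):
        raise ValueError("c must satisfy 0 <= c < 2^(2n)")
    mask = (1 << (n + 1)) - 1      # keeps the low n+1 bits, i.e. x % 2^(n+1)
    t1 = c >> (n - 1)              # drop the low n-1 bits of c
    t2 = t1 * mu                   # step 3 on the slide
    t3 = t2 >> (n + 1)             # step 4: quotient estimate
    r1 = c & mask                  # step 5: c % 2^(n+1)
    r2 = (t3 * q) & mask           # step 6: (t3 * q) % 2^(n+1)
    r = r1 - r2                    # step 7
    # Assumed textbook correction; the slide's condition is not fully legible.
    if r < 0:
        r += 1 << (n + 1)
    while r >= q:                  # at most two subtractions are needed
        r -= q
    return r


# Usage with the illustrative modulus q = 829 (a 10-bit value).
n, mu = barrett_setup(829)
print(barrett_reduce(123456, 829, n, mu))  # prints 764
```

The reduction avoids a general division: once mu is precomputed, each call needs only two multiplications, shifts, masks, and a few subtractions, which is why shift and subtraction primitives are the operations worth mapping onto the memory array.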