Accelerating GPU Applications with In-Network Computing Technology (网络计算技术加速 GPU 应用.pdf, 68 pages)
NVIDIA
IN-NETWORK COMPUTING PROGRAMMING FOR GPU APPLICATIONS
GTC CHINA 2020
Qingchun Song, December 2020

AGENDA
REMOTE DIRECT MEMORY ACCESS (RDMA)
GPU DIRECT RDMA (GDR)
GPU DIRECT STORAGE (GDS)
SCALABLE HIERARCHICAL AGGREGATION AND REDUCTION PROTOCOL (SHARP)
NCCL SHARP

REMOTE DIRECT MEMORY ACCESS (RDMA)

INFINIBAND'S LAYERED ARCHITECTURE
[Figure: the InfiniBand layer stack across End Node, Switch, Router, and End Node. Upper Level Protocols: clients exchanging IBA operations. Transport Layer: messages (QP), SAR. Network Layer: inter-subnet routing (IPv6). Link Layer: link encoding, media access control, flow control. Physical Layer: signaling.]

RDMA DATA TRANSFER MODEL
Queue Pair (QP)
    QPs are in pairs (Send/Receive)
Work Request (WR)
    Work items that the HW should perform
Work Completion (completion)
    When a WR is completed, it may create a Work Completion, which provides information about the ended WR
Work Queue (WQ)
    A queue which contains WRs
    Scheduled by the HW
    Can be either a Send or a Receive Queue
Completion Queue (CQ)
[Figure: a local QP and a remote QP, each with Transmit and Receive work queues holding WQEs, and a Completion Queue holding CQEs.]

RDMA OPERATION: SEND
The responder posts Receive Requests (before data is received)
The requester posts a Send Request
Only data is sent over the wire
ACK is sent only in reliable transport types
[Figure: requester/responder timeline. Sync; responder: Post RR; requester: Post SR; data; ACK; Poll CQ on both sides.]

RDMA OPERATION: RDMA WRITE
The requester posts a Send Request
Data and remote memory attributes are sent
Responder is passive
Immediate data can be used to consume RRs at the responder side
ACK is sent only in reliable transport types
[Figure: requester/responder timeline. Requester: Post SR; data + addr + rkey; ACK; Poll CQ.]

RDMA OPERATION: RDMA READ
The requester posts a Send Request
Data and remote memory attributes are sent
Responder is passive
[Figure: requester/responder timeline. Requester: Post SR; addr + rkey.]
Data is sent from t