"Winners are those who went through more iterations of the 'loop of progress' - going from an idea, to its implementation, to actionable results. So the winning teams are simply those able to run through this loop faster."
- Francois Chollet, creator of Keras

DATA SCIENTIST WORKFLOW
The Average Data Scientist Spends 90+% of Their Time in ETL as Opposed to Training Models

[Diagram: a CPU-powered workflow timeline - dataset collection, dataset download, analysis, training, inference - repeatedly interrupted by "forgot to add a feature, restart ETL workflow" loops, overnight waits, and "switch to decaf".]
GPU-ACCELERATED FEATURE ENGINEERING
Results from ACM RecSys Challenge 2020 Winners

[Chart: computation time in seconds for the winning pipeline (count encoding, target encoding, lag features, XGBoost train & predict, other) on different infrastructure and libraries]
- Intel Xeon CPU (20 cores): 3,570 s
- 1x V100: 1,020 s
- 4x V100: 270 s
- 4x V100 + UCX: 138 s
Source: https:/
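For context, the feature types in the chart (count encoding, target encoding, lag features) are ordinary groupby, merge, and shift operations, which is what makes them straightforward to move onto the GPU with cuDF. Below is a minimal sketch of that pattern; the toy DataFrame and the column names user_id, timestamp, and engaged are illustrative assumptions, not the winners' actual code.

# Minimal cuDF sketch (illustrative data): count encoding, target encoding,
# and a per-user lag feature, i.e. the feature types listed in the chart above.
import cudf

df = cudf.DataFrame({
    "user_id":   [1, 1, 2, 2, 2, 3],
    "timestamp": [10, 20, 5, 15, 25, 7],
    "engaged":   [0, 1, 1, 0, 1, 0],   # binary target
})

# Count encoding: number of rows contributed by each user.
counts = df.groupby("user_id").agg({"engaged": "count"}).reset_index()
counts = counts.rename(columns={"engaged": "user_count"})
df = df.merge(counts, on="user_id", how="left")

# Target encoding: mean target per user (no out-of-fold smoothing in this sketch).
target_enc = df.groupby("user_id").agg({"engaged": "mean"}).reset_index()
target_enc = target_enc.rename(columns={"engaged": "user_target_enc"})
df = df.merge(target_enc, on="user_id", how="left")

# Lag feature: each user's previous engagement, ordered by time.
df = df.sort_values(["user_id", "timestamp"]).reset_index(drop=True)
df["prev_engaged"] = df.groupby("user_id")["engaged"].shift(1)

print(df)

The same groupby/merge/shift pattern applies unchanged at competition scale; only the size of the DataFrame changes.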
GPU-ACCELERATED FEATURE ENGINEERING
Industry Standard Benchmark: Up to 350x Faster Queries; Hours to Seconds!

10 TB results: RAPIDS running on 16 NVIDIA DGX A100s.

An industry-standard data science benchmark consisting of 30 end-to-end queries representing real-world ETL and machine learning workflows, involving both structured and unstructured data. It was run at two different data sizes, 1 TB and 10 TB.

RAPIDS results at 1 TB (2 DGX A100s) and 10 TB (16 DGX A100s) on these large-scale data analytics problems:
- 1 TB: 37.1x average speed-up
- 10 TB: 19.5x average speed-up (7x normalized for cost)

Powered by the RAPIDS ecosystem: CuPy, BlazingSQL, Numba, and Dask.
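The multi-GPU and multi-node results above rely on Dask to partition DataFrames across GPUs, with each partition processed by cuDF. Below is a minimal single-node sketch of that setup; the cluster configuration, the Parquet path, and the column names store_id, day, and amount are placeholder assumptions, not the benchmark implementation.

# Minimal multi-GPU ETL sketch with Dask + dask_cudf (placeholder path and columns).
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf

if __name__ == "__main__":
    # One Dask worker per visible GPU (up to 8 in a single DGX A100 node).
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # Each partition is a cuDF DataFrame resident in GPU memory.
    ddf = dask_cudf.read_parquet("transactions/*.parquet")

    # A representative ETL aggregation, executed on all GPUs in parallel.
    daily_totals = (
        ddf.groupby(["store_id", "day"])
           .agg({"amount": "sum"})
           .reset_index()
    )
    print(daily_totals.compute().head())

    client.close()
    cluster.close()

The "4x V100 + UCX" configuration in the earlier chart applies the same idea at shuffle time: with dask-cuda, worker-to-worker traffic can be moved off the default TCP transport and onto UCX (which can use NVLink) by constructing the cluster as LocalCUDACluster(protocol="ucx").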
WHY GPUS FOR ETL
Numerous Hardware Advantages: NVIDIA DGX A100 System
- Thousands of cores with up to ~20 TeraFLOPS of general-purpose compute performance
- Up to 1.5 TB/s of memory bandwidth
- Hardware interconnects for up to 600 GB/s of bidirectional GPU-to-GPU bandwidth
- Can scale up to 8x GPUs in a single node
- Almost never run out of compute relative to memory bandwidth!

RAPIDS End-to