Accelerate Cloud Training with Alluxio
Data Fun
Lu Qiu, Alluxio

Lu Qiu
Machine Learning Engineer, Alluxio
Alluxio PMC maintainer
Master of Data Science, GWU
Responsible for integrating Alluxio with deep learning
Areas: Alluxio fault-tolerant system, journal system, metrics system, and POSIX API; Alluxio integration with the cloud

Agenda
Alluxio and its POSIX API
Accelerate cloud training with Alluxio
  Level 1: Storage Read Accelerating
  Level 2: Data Preprocessing & Training
  Level 3: Data Orchestration Layer

Alluxio & its POSIX API

Data Orchestration for Analytics & AI in the Cloud

DATA ACCESSIBILITY
Convert from client-side interface to native storage interface

DATA LOCALITY
Local performance for remote data with intelligent multi-tiering
Hot / Warm / Cold: RAM / SSD / HDD
Read & Write Buffering
Transparent to App
Policies for pinning, promotion/demotion, TTL
On-premises, Public Cloud
Model Training, Big Data ETL, Big Data Query
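
The multi-tiering idea on the data locality slide (hot, warm, and cold tiers with promotion and demotion) can be illustrated with a toy model. The sketch below is not Alluxio's implementation: the `TieredCache` class, its method names, and the least-recently-used demotion rule are all assumptions made for illustration, and pinning and TTL policies are left out to keep it short.

```python
from collections import OrderedDict


class TieredCache:
    """Toy multi-tier cache. Tier 0 is the hottest (e.g. RAM), the last tier the coldest (e.g. HDD).

    Hypothetical illustration only; not how Alluxio is implemented.
    """

    def __init__(self, capacities):
        # capacities[i] is the maximum number of blocks tier i may hold.
        self.capacities = list(capacities)
        self.tiers = [OrderedDict() for _ in self.capacities]

    def tier_of(self, key):
        """Return the index of the tier holding key, or None if it is not cached."""
        for level, tier in enumerate(self.tiers):
            if key in tier:
                return level
        return None

    def _insert(self, level, key, value):
        if level >= len(self.tiers):
            return  # nothing colder is left: the block leaves the cache
        tier = self.tiers[level]
        tier[key] = value
        tier.move_to_end(key)  # mark as most recently used in this tier
        if len(tier) > self.capacities[level]:
            # Demotion: push the least recently used block one tier down.
            old_key, old_value = tier.popitem(last=False)
            self._insert(level + 1, old_key, old_value)

    def get(self, key, load):
        """Return the block for key, calling load(key) only on a cache miss."""
        level = self.tier_of(key)
        if level is None:
            value = load(key)  # miss: fetch from the (slow) under storage
        else:
            value = self.tiers[level].pop(key)
        # Promotion: every access moves the block to the hottest tier.
        self._insert(0, key, value)
        return value


if __name__ == "__main__":
    cache = TieredCache([1, 1])  # one block of "RAM", one block of "SSD"
    fetch = lambda key: f"data for {key}"  # stand-in for a remote read
    cache.get("a", fetch)
    cache.get("b", fetch)  # "a" is demoted to tier 1
    print(cache.tier_of("a"), cache.tier_of("b"))
```

In this model a repeated read of a demoted block is served from the cache and moves the block back to tier 0, which is the behaviour the "promotion/demotion" label on the slide suggests.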

METADATA LOCALITY
Synchronization of changes across clusters
Old File at path /file1 -> New File at path /file1
Alluxio Master
Policies for pinning, promotion/demotion, TTL
Metadata Synchronization
Mutation
On-premises, Public Cloud
Model Training, Big Data ETL, Big Data Query

Alluxio POSIX API
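
What a POSIX API means for training code can be sketched with ordinary file calls: once the data is exposed as a directory, no storage-specific client is needed. In the sketch below the mount point `/mnt/alluxio-fuse`, the helper names `list_training_files` and `read_file`, and the image suffixes are hypothetical choices for illustration; only Python's standard library is used.

```python
import os

# Hypothetical mount point; the real path depends on how the POSIX mount is configured.
MOUNT_POINT = "/mnt/alluxio-fuse"


def list_training_files(root, suffixes=(".jpg", ".png")):
    """Return the sorted paths of files under root whose names end with one of suffixes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def read_file(path, chunk_size=1 << 20):
    """Read a whole file in chunks using plain open() and read()."""
    chunks = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__" and os.path.isdir(MOUNT_POINT):
    # Runs only where the hypothetical mount point actually exists.
    for path in list_training_files(MOUNT_POINT):
        print(path, len(read_file(path)))
```

Because both helpers take the directory as an argument, the same code runs unchanged against a local folder, which is the point of presenting remote data as local directories.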

HDFS #1, Obj Store, NFS, HDFS #2
Connecting to: HDFS, Amazon S3, Azure, Google Cloud, Ceph, NFS, many more
Accessing Remote/Distributed Data as Local Directories

Accelerating Cloud Training with Alluxio

Level 1: Accelerating under storage data access
Training Clusters
Hot / Warm / Cold: RAM / SSD / HDD
Read Buffering
Transparent to App
Policies for pinning, promotion/demotion, TTL
Under Storage
Kubernetes Cloud Cluster

1. Accelerating under storage data access
One Click to Mount UFS to Alluxio
All data located in s3:/ will be cached by Alluxio, which provides data locality for training jobs.
$ bin/alluxio fs mount /s3 s3:/ --option aws.accessKeyI