JuiceFS CSI in Multi-Thousand Node Kubernetes Clusters for LLM Pre-Training
Weiwei Zhu, Juicedata

About me
  Weiwei Zhu
  Fullstack Engineer at Juicedata
  Maintainer of JuiceFS CSI Driver, Maintainer of Fluid

Agenda
  Storage challenges for LLM training in K8s
  How JuiceFS addresses these challenges
  Optimizations for multi-thousand node clusters
  Demo with JuiceFS

1. What are Storage challenges for LLM training in K8s?

LLM Models
  Model        Parameters   Size
  GLM          9 B          5.5 GB
  Yi 1.5       34 B         19 GB
  qwen2        72 B         41 GB
  Llama 2      70 B         39 GB
  Llama 3.1    70 B         40 GB
  Llama 3.1    405 B        231 GB

Storage for LLM training (Characteristics, Challenges)
  Tens of billions of files
  Multi-cloud architecture
  Mix of large and small files
  Cost control
  POSIX compliance
  High data security

Storage for LLM training in K8s
  Data elasticity in elastic clusters
  Data consistency in multi-cloud
  Data security

2. How JuiceFS addresses these challenges

What is JuiceFS
  Cloud Native
  POSIX Compatible
  Distributed
  Compression
  Strong Consistency
  Outstanding Performance
  Security
  An open-source, high-performance distributed file system designed for cloud
  Apache License 2.0
  10k+ stars

What is JuiceFS
  Client
    Handles all file I/O operations
    Multiple protocols
  Data Storage
    File data is split and stored in object storage
    Supports almost all types of object storage
  Metadata Engine
    Stores file metadata
    A variety of common databases, like Redis, TiKV, MySQL/MariaDB and PostgreSQL
    An in-house high-performance metadata engine

JuiceFS in Kubernetes
  Static Provision

JuiceFS in Kubernetes
  Dynamic Provision

JuiceFS in Kubernetes

JuiceFS in Serverless

Data Security in JuiceFS PV
  Data isolation
    For dynamic PVC, data will be stored in different directories.
  Data encryption
    Set passphrase and RSA private key in Secret which is used by StorageClass
  UID and GID of Unix-like systems to manage file permissions
  POSIX ACL permissions
  Permi