Efficient Stable Diffusion Pre-Training on Billions of Images with Ray
Yunxuan Xiao, Hao Chen (Anyscale Inc.)
2024 Databricks Inc. All rights reserved

Speakers

Yunxuan Xiao, Software Engineer, Anyscale
- Maintainer of Ray Train and Ray Tune.
- Building large-scale distributed training infrastructure.

Hao Chen, Staff Software Engineer, Anyscale
- Tech lead of Ray Data.
- Early Ray committer.
- Previously led Ant Group's Ray team, which built the world's largest Ray production workloads.

Overview

- We pre-trained the Stable Diffusion v2 model on 2 billion images for under $40,000.
- Utilized Ray Data to efficiently process large datasets with heterogeneous resources and mitigate preprocessing bottlenecks.
- Conducted scalable, fault-tolerant training with Ray Train, accelerating training throughput by 3x with infrastructure and algorithm optimizations.

Content

- Stable Diffusion Pre-training and Challenges
- Scalable Data Processing with Ray Data
- Efficient Distributed Training with Ray Train

A pre-trained VAE and a text encoder (OpenCLIP-ViT/H) encode the input images and text prompts. A tr