TalkingData: Alluxio - Open-Source Storage Orchestration Platform for AI and Big Data (36 pages, PDF)
Alluxio - Open-Source Storage Orchestration Platform for AI and Big Data

Gu Rong (顾荣)
Alluxio PMC & Maintainer
Associate Researcher, Ph.D., Department of Computer Science, Nanjing University

Outline
1. Introduction to the Alluxio project and system
2. Overview of new features in Alluxio 2.0
3. A quick look at Alluxio's future directions
4. Summary

Four major trends in data processing are driving the need for a new kind of infrastructure
- Separation of compute & storage
- Hybrid multi-cloud environments
- Self-service data across the enterprise
- Rise of the object store

[Figure: Data Ecosystem - Beta and Data Ecosystem 1.0, each showing COMPUTE and STORAGE layers]

The big data journey and the choices for enterprise innovation
- Co-located: Co-located compute & HDFS on the same cluster (diagram labels: MR / Hive, HDFS)
- Disaggregated: Disaggregated compute & HDFS on the same cluster (diagram labels: Hive, HDFS)
- Hybrid cloud deployment of HDFS: Burst HDFS data in the cloud, public or private
- Support for more compute frameworks: Support Presto, Spark and other computes without app changes
- Transition to object stores: Enable & accelerate big data on object stores

Challenges in the technology transition

Hybrid cloud deployment of HDFS
- Accessing data over the WAN is too slow
- Copying data to the compute cloud is time consuming and complex
- Using another storage system like S3 means expensive application changes
- Using S3 via the HDFS connector leads to extremely low performance

Support for more compute frameworks
- Copying data to multiple compute clouds is time consuming and error prone
- Migrating applications to new storage systems is complex & time consuming
- Storing and managing multiple copies of the data becomes expensive

Transition to object stores
- Object store performance for big data workloads can be very poor
- No native support for popular frameworks
- Expensive metadata operations reduce performance even more
- No direct support for hybrid environments

12/2/19

Compute and storage scale independently

Alluxio: a Virtual Distributed File System (VDFS)
Unifying Data at Memory Speed
- Application-facing interfaces: FUSE Compatible File System, Hadoop Compatible File System, Native Key-Value Interface, Native File System
- Storage-facing interfaces: GlusterFS Interface, Amazon S3 Interface, Swift Interface, HDFS Interface

Java File API HDFS