Dealing with Big Data and moving towards AI
Alexander Zevaykin, PhD
Group Leader at Yandex Infrastructure
ydb.tech/zh

Yandex consists of over 90 services, used by millions of people daily
Yandex builds a lot of its infrastructure in-house:
- Information search
- Computer Vision
- Neural language models (GPT)
- AI-based simultaneous translation of videos
- Self-driving vehicles
- Cloud technologies
- Speech technologies
- Crowdsourcing
- Routing and navigation technologies
- Weather forecasting technology Meteum 2.0
25700+ employees

Part 1. YDB: dealing with Big Data

What is YDB?
Distributed SQL database for operational and analytical workloads
- Horizontal scaling
- ACID transactions in multiple AZ
- Operability and automatic recovery in case of failures
- Scales to millions of transactions per second and petabytes of data
- Open-Source with Apache 2.0 license
YDB is an open-source, distributed, fault-tolerant SQL database that combines high availability and scalability with strong consistency and ACID transactions. It can handle transactional (OLTP), analytical (OLAP), and streaming workloads at the same time.

YDB in Yandex
Timeline years: 2014, 2017, 2022, 2024
Timeline labels: First commit; Base for Yandex Cloud; Open-Source; 35000+ nodes, 5000+ databases, 70+ PB storage
YDB was born at Yandex, Russia's largest IT company, and has a ten-year development history.

Shared Nothing
Our architecture is based on shared nothing
- Cluster of bare metal or virtual machines
- Shared nothing architecture
- Commodity hardware
- The cluster both stores the data and processes user queries

Compute-Storage separation
- Scalability
- Cost-efficiency
- Flexibility
Compute and storage nodes are managed independently
[Diagram: Compute nodes, Tablets, Storage nodes]

Table Partitions Autosplit and Balancing
Tables are split and balanced automatically
- Split by load
- Split by size
- YDB evenly distributes table partitions among the nodes

Mirror-3-dc
- 3 availability zones
- Storage factor 3
- Copes with the loss of one AZ + one server rack in