Where will Big Data technology go? Under the trend of HTAP
Dongxu Huang, PingCAP

About me
  - Dongxu (Ed) Huang
  - CTO, PingCAP
  - Database / Distributed System
  - Big fan of OSS, TiDB / TiKV / Codis

When we talk about Big Data, what are we really talking about?
  - Distributed storage?
  - Distributed computing?
  - Hadoop?
  - OLAP databases?

In the good old days
  - The database is the only source of truth.
  - Once upon a time, we had OLTP & OLAP

When Distributed System meets OLAP
  - Once upon a time, everything was becoming bigger and bigger
  - Google borrowed map/reduce from FP to save the world
  - Google File System + MapReduce
  - Why FS? Not SQL?
  - Then, we have Hadoop

OLAP on the Hadoop ecosystem
  - In the Hadoop world:
      - HDFS
      - Hive: bring SQL to MR
  - What's the problem with Hive? SO SLOW!

Need for speed: From MR to DAG
  - Spark
      - RDD (Spark SQL & DataFrame) + DAG

On the other side
  - Distributed Computing + SQL
  - MPP
  - MOLAP (Data Cube)
  - Vectorized Computing + Columnar
  - What's wrong with MPP Database? In-place Update?

How about Real-time Update?
  - Columnar vs. row-based format
  - Lambda / Kappa Architecture
  - Unavoidable ETL
  - Flink / Spark Streaming / Kafka
  - ODS-DWD-DWS

What's wrong with Lambda Architecture
  - We need real-time!
  - Flexibility: it's not easy to update the ETL process
  - Data bloat from temporary storage of intermediate results

Let's take a look at the other side: OLTP world
  - From a humble start: How to provide online services for massive amounts of data?
  - The needs of OLTP:
      - Low-latency CRUD
      - Consistency
      - Resilience

When Distributed System meets OLTP
  - RDBMS Sharding
  - NoSQL
  - The rise of Distributed SQL database (in early da