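Copyright 2016. All rights reserved

New-Generation Data Warehouse: HAWQ

Contents
- Company profile
- HAWQ
- Success stories

Data Ecosystem
- Applications: user behavior analysis, anti-fraud, user profiling, credit models
- BI: Qlik, PowerBI
- Analytics and mining, machine learning/AI: SAS, SPSS, Tensorflow
- ETL: Informatica, Talend, Kettle
- OLAP, data warehouse (Data Warehouse): MPP, SQL-on-Hadoop, New Data Warehouse
- Data governance, data security
- OLTP: relational databases, NoSQL, NewSQL
- Cloud (public and private)

The global data warehouse market reached tens of billions of US dollars in 2016.

[Chart 1, 2014 to 2020?; title and units not given: values 920, 1403, 2091, 3302, 5070, 7278, 10270; growth rates 52.50%, 49.04%, 57.91%, 53.54%, 43.55%, 41.11%]
[Chart 2, 2014 to 2020?; title and units not given: values 1038, 1692, 2485, 3517, 5907, 9347, 13626; growth rates 26.47%, 63.01%, 46.87%, 41.53%, 67.96%, 58.69%, 45.36%]

Databases: 55 Years
Database: first appeared in 1962
- Inverted File Database System
- System Development Corporation
Stages of database history
- 1960s: Navigational DBMS (network and hierarchical models)
  Integrated Data Store (IDS)
  Information Management System (IMS)
- 1970s-1990s: SQL/Relational DBMS
  OLTP, Data warehouse, MPP
- 2000s-Present: Post-Relational
  NoSQL (XML, KV, Graph, Tree), NewSQL, New DW

The Core of a Database
- Data model and query language
- Query optimization and execution
- Indexing and storage
- Transaction processing

The Relational Model
- Edgar F. Codd: 1981 Turing Award
- Jim Gray: 1998 Turing Award
- Michael Stonebraker: 2014 Turing Award
Example: find all customers who live in Harrison.
    SELECT customer_name
    FROM customer
    WHERE customer_city = 'Harrison';
Reference: E. F. Codd, "A Relational Model of Data for Large Shared Data Banks" (1970).

Graph/Tree/KV Models
- Key-Value
  Cassandra: CQL
  HBase: API
- Graph model
  Neo4j
  Giraph/Pregel
- Tree
  XML databases
  MongoDB
- Streaming

Other Ways to Classify Databases
- Transaction processing vs. analytical processing
- Parallel vs. serial
- Hardware: CPU vs. GPU vs. FPGA vs. memory
- Cloud database vs. non-cloud database?

The Evolution of the Data Warehouse
- Traditional data warehouse (MPP): DB instances 1 to 4, each with its own disk. Share-nothing hardware/software architecture.
- Traditional data warehouse (shared storage): DB instances 1 to 4 on top of shared storage. Share-storage hardware/software architecture.
- New-generation data warehouse (New Data Warehouse): DB instances 1 to 3 on top of a distributed file system that spans the disks. Share-nothing hardware architecture plus distributed shared-storage implemented in software.
Hardware configuration architecture scalability lacking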