2023 AirMettle, Inc. All Rights Reserved.
Virtual Conference, September 28-29, 2021

Computational Storage Service
A Real-Time Smart Data Lake
Presented by: Donpaul Stephens

Agenda
  What is Big Data?
  Computational Storage: challenges
  Computational Storage Service
  Reference Design

What is Big Data?
  Digital Packratism?

What is Big Data?
  Unstructured?

Most data is Semi-Structured
  Encrypted data is closest to uncompressible white noise
  Stored in a formatted file
  Object! Because historical records can be appended,
  but you can't rewrite the past, corrections must be trackable

How BIG is the data?
  They don't call it Big Data for nuthin'!
  0.4 to 1 GB+ per file
  Video: 1.5 GB to 4 GB per hour

Extracting insights from Tabular Data (via SQL)
Security Information & Event Management
  Collects sample measurements with certain flags and arguments and groups them by minute.
  Returns the number of samples, average duration, and standard deviation of duration for each group.

    select to_string(event_ts, 'yyyy-mm-dd hh24:mi') as interval,
           count(*),
           avg(cast(event_dur as int)),
           stddev_samp(cast(event_dur as int))
    from events
    where flgs like 'C_'
      and regexp_contains(args, 'JY.')
      and event_ts between to_timestamp('2000-01-01 00')
                       and to_timestamp('2000-01-01 01')
    group by interval

Extracting insights from Tabular Data (via SQL)

    select sum(lo_revenue), d_year, p_brand1
    from lineorder, date, part, supplier
    where lo_orderdate = d_datekey
      and lo_partkey = p_partkey
      and lo_suppkey = s_suppkey
      and p_category = 'MFGR#12'
      and s_region = 'AMERICA'
    group by d_year, p_brand1
    order by d_year, p_brand1

Star Schema Benc