Visualizing Trillion Point Datasets with Databricks and Plotly
Sachin Seth
Date 6/13/2024
2024 Databricks Inc. All rights reserved

Introduction
Short overview of the presentation and topics we are going to cover

Purpose: Explore the use of Databricks SQL and Plotly Arrow Resampler (new) for very-large-scale time-series data visualization (relevant for vehicle test fleet analytics such as at Mercedes)
Server-side implementation, using Databricks SQL, of sophisticated downsampling algorithms
Pass downsampled Arrow files from Databricks to Plotly Dash Enterprise using Databricks SQL Connectors
Client-side downsampling using Plotly Arrow Resampler for optimized real-time visualization

Analyzing the Fleet
Description of the business problem

Background
Vehicle test fleets, such as those at Mercedes, often include hundreds of vehicles (or more) generating terabytes of data daily (hundreds of billions of high-fidelity, high-frequency time-series data points), including telemetry, diagnostics, and usage patterns.
Data is ingested into Databricks in varied formats and at varied velocities (from real-time GPS feeds to batch-uploaded maintenance logs).

Requirement
Multiple teams of engineers want to analyze this data daily, both locally and remotely, with a high degree of interactivity and flexibility. At these scales, that requires very high performance to work effectively.
Directly visualizing raw data from thousands of vehicles is currently impractical.

2024 Databricks