Optimizing Analytics Infrastructure: Lessons from Migrating Snowflake to Databricks
Amit Rustagi, June 2025

Doing more with less is the new imperative.

Migration Rationale

Key Considerations
- Unified platform
- Scalability
- Flexibility
- Cost efficiency

Performance Benchmarks
- Performance tests
- Throughput
- Query execution time
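The deck does not show how the benchmarks were run. As a rough illustration, assuming a hypothetical sales_fact table and the cluster's active SparkSession, query execution time and throughput could be measured along these lines:

    import time
    from pyspark.sql import SparkSession

    # On Databricks a SparkSession already exists as `spark`; getOrCreate() returns it.
    spark = SparkSession.builder.getOrCreate()

    # Hypothetical benchmark query and source table; substitute the workload being compared.
    SOURCE_TABLE = "sales_fact"
    BENCHMARK_SQL = f"""
        SELECT region, SUM(amount) AS revenue
        FROM {SOURCE_TABLE}
        GROUP BY region
    """

    def run_benchmark(sql: str, runs: int = 5) -> None:
        """Time a query several times; report execution time and rows scanned per second."""
        input_rows = spark.table(SOURCE_TABLE).count()
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            spark.sql(sql).collect()  # collect() forces full execution of the query
            timings.append(time.perf_counter() - start)
        best, avg = min(timings), sum(timings) / len(timings)
        print(f"best={best:.2f}s  avg={avg:.2f}s  "
              f"throughput~{input_rows / best:,.0f} input rows/s")

    run_benchmark(BENCHMARK_SQL)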
Migration Planning

Architecture Assessment
- Review the Snowflake schema
- Review data distribution
- Define the Databricks target architecture
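One way to start the schema and data-distribution review is to pull table-level metadata from Snowflake's INFORMATION_SCHEMA and rank tables by size. A sketch using the Snowflake connector bundled with Databricks Runtime; connection options are placeholders and should come from a secret scope in practice:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Placeholder Snowflake connection options.
    sf_options = {
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfUser": "MIGRATION_USER",
        "sfPassword": "<secret>",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MIGRATION_WH",
    }

    # Table-level metadata (row counts and sizes) helps size the Databricks target
    # and pick which tables to migrate first.
    inventory_sql = """
        SELECT table_schema, table_name, row_count, bytes
        FROM information_schema.tables
        WHERE table_type = 'BASE TABLE'
        ORDER BY bytes DESC
    """

    inventory = (
        spark.read.format("snowflake")
        .options(**sf_options)
        .option("query", inventory_sql)
        .load()
    )
    inventory.show(20, truncate=False)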
Tool Selection
- Schema migration tool
- Data migration
- Metadata alignment

Risk Mitigation
- Incremental data migration
- Robust observability
- Data governance

Implementation

Data Extraction
- Extract as a batch + incremental hybrid
- Validate extracted files
- Run data profiling
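The batch + incremental hybrid is not detailed in the deck. A minimal sketch, assuming a hypothetical orders table with an updated_at watermark column and an S3 landing zone: a one-time full extract, watermark-filtered incremental pulls afterwards, and light profiling of the extracted files.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Placeholder Snowflake connection options (see the inventory sketch above).
    sf_options = {
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfUser": "MIGRATION_USER",
        "sfPassword": "<secret>",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MIGRATION_WH",
    }

    LANDING_PATH = "s3://migration-landing/orders"  # hypothetical landing zone
    WATERMARK = "2025-05-01 00:00:00"               # in practice, read from a state/manifest table

    def extract(incremental: bool) -> None:
        """Full extract on the first run, watermark-filtered extract on later runs."""
        query = "SELECT * FROM orders"
        if incremental:
            query += f" WHERE updated_at > '{WATERMARK}'"
        df = (spark.read.format("snowflake")
              .options(**sf_options)
              .option("query", query)
              .load())
        df.write.mode("append").parquet(LANDING_PATH)

    def profile(path: str) -> None:
        """Lightweight profiling of extracted files: volume, null rates, duplicate keys."""
        df = spark.read.parquet(path)
        print("rows:", df.count())
        df.select([F.mean(F.col(c).isNull().cast("int")).alias(c) for c in df.columns]).show()
        df.groupBy("order_id").count().filter("count > 1").show()

    extract(incremental=True)
    profile(LANDING_PATH)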
Data Loading
- Load manifest tracker and monitoring
- Build idempotent loading
- Use Delta Live Tables (DLT)
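For idempotent loading, one common pattern is a Delta MERGE keyed on the business key, so replaying an extract (for example after a partial load) does not duplicate rows. A sketch with the same hypothetical orders names; the manifest tracker and monitoring pieces are omitted:

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    LANDING_PATH = "s3://migration-landing/orders"  # extracted Parquet files
    TARGET_TABLE = "analytics.orders"               # Delta target

    updates = spark.read.parquet(LANDING_PATH).dropDuplicates(["order_id"])

    if not spark.catalog.tableExists(TARGET_TABLE):
        # First run: create the Delta table from the initial batch extract.
        updates.write.format("delta").saveAsTable(TARGET_TABLE)
    else:
        # Later runs: MERGE on the key so re-running the same load is a no-op
        # for already-applied rows instead of producing duplicates.
        (DeltaTable.forName(spark, TARGET_TABLE).alias("t")
            .merge(updates.alias("s"), "t.order_id = s.order_id")
            .whenMatchedUpdateAll()
            .whenNotMatchedInsertAll()
            .execute())

Delta Live Tables can express similar logic declaratively; the MERGE version is shown here because it maps directly onto the idempotency point.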
Pipeline Refactoring
- Ingest: Auto Loader or COPY INTO (see the Auto Loader sketch below)
- Transform: PySpark with DLT
- Scheduling: Workflows

Performance Optimization
- Storage layout
- Query engine
- Cost-performance balance
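For the Auto Loader ingest path named under Pipeline Refactoring, a minimal bronze-layer sketch could look like the following; paths and table names are hypothetical, cloudFiles is Databricks-specific, and the availableNow trigger lets the stream run as a scheduled Workflows job rather than an always-on cluster:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    SOURCE_PATH = "s3://migration-landing/orders"                      # files written by extraction
    CHECKPOINT = "s3://migration-landing/_checkpoints/orders_bronze"   # stream state + inferred schema

    bronze = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", CHECKPOINT)  # Auto Loader tracks schema changes here
        .load(SOURCE_PATH)
    )

    (bronze.writeStream
        .option("checkpointLocation", CHECKPOINT)
        .trigger(availableNow=True)   # process everything that has landed, then stop
        .toTable("analytics.orders_bronze"))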
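For the storage-layout point, file compaction and clustering are usually the first levers on Delta tables, and column statistics help the query engine with planning. A short sketch against the hypothetical analytics.orders table:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Compact small files and cluster on a selective filter column so queries can skip files.
    spark.sql("OPTIMIZE analytics.orders ZORDER BY (customer_id)")

    # Column-level statistics for the query planner.
    spark.sql("ANALYZE TABLE analytics.orders COMPUTE STATISTICS FOR ALL COLUMNS")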
Challenges and Solutions

Schema Compatibility
- Data type incompatibility
- Schema evolution
- Permissions and governance
- Table types and storage differences
- Constraints and keys
- NULLs and defaults

Data Consistency (see the validation sketch below)
- Partial loads
- Data ordering
- Metadata loss
- Concurrency
- NULL handling
- Timestamp drift

Query Rewrite
- SQL functions
- Time handling
- Security policy
- Window functions
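For the data-consistency checks listed above, comparing row counts plus a few aggregates between the Snowflake source and the Delta target catches partial loads, dropped rows, and timestamp drift. A rough sketch, reusing the placeholder names from the earlier sketches; numeric and timezone normalization may need extra care in a real check:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Placeholder Snowflake connection options.
    sf_options = {
        "sfURL": "myaccount.snowflakecomputing.com",
        "sfUser": "MIGRATION_USER",
        "sfPassword": "<secret>",
        "sfDatabase": "ANALYTICS",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MIGRATION_WH",
    }

    CHECK_SQL = ("SELECT COUNT(*) AS row_count, SUM(amount) AS amount_sum, "
                 "MAX(updated_at) AS max_ts FROM orders")

    src = (spark.read.format("snowflake").options(**sf_options)
           .option("query", CHECK_SQL).load()
           .collect()[0].asDict())
    tgt = (spark.sql(CHECK_SQL.replace("FROM orders", "FROM analytics.orders"))
           .collect()[0].asDict())

    # Snowflake returns unquoted identifiers in upper case; normalize before comparing.
    src = {k.lower(): v for k, v in src.items()}
    tgt = {k.lower(): v for k, v in tgt.items()}

    # Row counts catch partial loads; the sum and max timestamp catch dropped rows,
    # merge mistakes, and timestamp drift that a count alone would miss.
    for metric in ("row_count", "amount_sum", "max_ts"):
        status = "OK" if src[metric] == tgt[metric] else "MISMATCH"
        print(f"{metric}: source={src[metric]} target={tgt[metric]} [{status}]")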
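For the query-rewrite items, Snowflake-specific functions need Spark SQL equivalents. An illustrative before/after with hypothetical columns: IFF becomes IF (or CASE WHEN), and TO_VARCHAR with a date format becomes date_format with a Java-style pattern.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Original Snowflake query (illustrative); IFF() and TO_VARCHAR() are Snowflake-specific.
    snowflake_sql = """
        SELECT order_id,
               IFF(amount > 1000, 'large', 'standard') AS order_size,
               TO_VARCHAR(updated_at, 'YYYY-MM-DD')    AS updated_day
        FROM orders
    """

    # Databricks SQL rewrite: note that both the function names and the datetime pattern change.
    databricks_sql = """
        SELECT order_id,
               IF(amount > 1000, 'large', 'standard')  AS order_size,
               date_format(updated_at, 'yyyy-MM-dd')   AS updated_day
        FROM analytics.orders
    """

    spark.sql(databricks_sql).show(5)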