Sponsor: Impetus Technologies. Future-Proofing Data at Scale: How Shutterfly Achieves GenAI-Driven Personalization (PDF, 17 pages)
Modernization for the GenAI Era: How Shutterfly Built a Future-Ready Data Foundation

We make life's experiences unforgettable

Shutterfly UC Migration

Business Goals:
- Migrate to UC for improved governance
- Build a platform that enables AI use cases
- Reduce maintenance work

Limitations:
- Manual migration required increased resources, time, and fund allocation

Shutterfly UC Migration Outcome

Impetus delivered a successful UC migration through 3 phases:
- Assessment: Created a clear project with defined timelines and resource needs
- Migration: Impetus Unity Catalog Migration Accelerator
- Operationalization: Conducted thorough testing and updated downstream applications

Glue vs. Unity Catalog

Criteria/Key Diff | Glue | Unity Catalog
Ext. Table Partition Discovery | Manual (msck repair/add partition) | Automatic
Drop Table (partitioned) | Time Consuming | Instantaneous
Schema Evolution Support | Partial (add columns towards the end) | Full
Governance/Access Control | Lake Formation/IAM changes | Native/ACL commands
Data Lineage | Not Supported | Supported
AI Monitoring | Not Supported | Supported
Cost | Catalog + API costs | No Cost

UC Migration Steps Involved
- Inventory Gathering
- UC Enablement
- Attach metastore to each workspace
- Catalog(s) to workspace binding
- User Identity Management
- Set up account level SCIM
- Group Migration
- Object Migration
- Create UC Artifacts
- Define Catalog Schemas
- Cluster and Code Adjustments
- Decommission Glue
- Downstream Upgrade
- Warehouse Upgrade
- Visualization Tools
- Table Upgrade

Architecture diagram (AWS Cloud)
Stages: Sources, Ingest, Store, Process, Enrich, Serve
Sources: Internal; RDS Oracle; 3rd Party Data Providers
Ingest: AWS Glue; Databricks; Scheduled data ingestion using Python Framework via AWS Glue/Databricks; Intraday Micro Batch/Nightly Bulk Extract
Data Warehouse: Amazon Redshift
Data Lakehouse: Raw Data (Unprocessed), CSV/JSON/XML; Conformed (Cleaned/Deduped), Parquet; Processed (Transformed/Aggregated), Parquet/Delta
Data Lakehouse Orche