Augmenting Generative AI for Complex Unstructured Data
Venkat Viswanathan (AWS) and Mayur Rajdev (The Baldwin Group)
June 11th, 2024
2024 Databricks Inc. All rights reserved

Complex Unstructured Data
  Forms
  Loan application
  Complex PDFs
  Specialized documents
  Invoices and receipts
  Vendor
  Lending documents
  W2
  1003
  Identity documents
  Tables
  Handwriting
  Signature

Industry Challenges when it comes to Complex Document Processing
  Manual Extraction
  Scalability
  Expensive
  Inconsistent Output
  Time to Market
  Decreased Employee Productivity
  Multi-Lingual Support (Language is not a barrier to run your business any more)
  Lack of Automation
  Classification
  PII/PHI Security
  Continuous Training

Stages of Complex Document Processing
  Data Capture: Aggregating and organizing documents from different business workflow(s) within your organization.
  Classification: If more than one document type then classify each document and send to the appropriate document pipeline.
  Extraction: Extract key business information.
  Enrichment: Getting insights and business value from your data.
  Review and Validation: Run business rules on your data and/or include human in the loop validation as needed.
  Ready for Business Decisions: Send information to downstream apps or databases.
  Business decision activities: Comparing documents with baseline, identifying gaps, and creating insights.

Key customer challenges
  Stages: Data Capture, Extraction, Validation, Classification, Enrichment
  Securely capture data
  Auto-correct for quality defects (distortion, dirt, rotated text, etc.)
  Capture data structures (tables, key-value pairs, entiti