TRANSCRIBED AUDIO AUTOMATIC CHECKLIST BASED ON GEN AI
Ana Paula Oliveira Bertholdo
June 10-13, 2024
2024 Databricks Inc. All rights reserved

About me
PhD in Computer Science, University of Sao Paulo (USP)
Information Technologist, Campinas State University (UNICAMP)
Over 17 years of work experience
Data Scientist, Brasilprev

Brasilprev
The leading private pension company in Brazil

Scenario: Customer Relationship Center (CRC)
The CRC receives an average of 1000 audio files per day.
Each audio has an average duration of 10 minutes.
Evaluators listen to the entire audio to identify items related to the customer service protocol.

Problem
Evaluators conduct random audio checks
Most audios are not evaluated
Human evaluators listen to complete audios
Audios are reviewed by evaluators who listen to the entire call in a non-automated manner and without integration into databases

Goal: Automated checklist for transcribed audio based on Generative AI
Evaluation of all audio recordings from the CRC, assessing adherence to the service quality protocol for each call.
There is no need to listen to the entire audio.

Method
Long audio transcription pipeline in Azure Databricks
Automated checklist for transcribed audio
Performance/cost comparison of LLMs with and without Vector Search (BGE)

Method
Cognitive services Cog