Applications and Research of NLP Technology in Voice Assistants
Zhang Fan, Senior Algorithm Engineer, Xiaomi

CONTENT
01 Conversational AI Agent
02 XiaoAI Model Pipeline
03 Self-Learning

01 Conversational AI Agent

Conversational AI Agent: pipeline components

Component | Input | Output | Example
Automatic Speech Recognition (ASR) | Speech | Text (1-best or n-best) | "播放他的青花瓷" ("Play his Qing Hua Ci")
Natural Language Understanding (NLU) | Text | Slots & Intent | Intent: PlayMusic; Slots: Anaphor=他 (he), Song=青花瓷
Dialogue State Tracking (DST) | Context & Slots & Intent | Slots & Intent | Intent: PlayMusic; Slots: Artist=周杰伦 (Jay Chou), Song=青花瓷
Ranking | n-best Slots & Intent | Slots & Intent | Select the best semantic interpretation
Skill | Slots & Intent | Text | Execute music playback and reply
Text-to-Speech (TTS) | Text | Speech | "好的,为你播放周杰伦的青花瓷" ("OK, playing Jay Chou's Qing Hua Ci for you")

Conversational AI Agent: multi-turn example

Turn 1:
- Text: 播放周董的青花瓷 ("Play Zhou Dong's Qing Hua Ci"; 周董 is a nickname for Jay Chou)
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=青花瓷
- Domain=Video, Intent=PlayVideo, Artist=周董, MV=青花瓷
Turn 2:
- Text: 播放他的滑如雪 ("Play his Hua Ru Xue")
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=滑如雪
Turn 3:
- Text: 是发如雪 ("It's Fa Ru Xue")
- Domain=Music, Intent=PlayMusic, Artist=周董, Song=发如雪

Dialogue:
User: 播放周董的青花瓷
Agent: 好的 ("OK")
User: 播放他的滑如雪
Agent: 未找到,请问想播放什么? ("Not found. What would you like to play?")
User: 是发如雪

Intent Classification and Slot Filling

Input
- Utterance, Phoneme
  - Bo1 fang4 qing1 hua1 ci2
- Knowledge Info
  - Song: 青花瓷

Knowledge Enhanced Multi-task Model: model details
Model
- Knowledge encoder
- Pre-trained BERT encoder
- Feature fusion layer
Multi-task heads
- Intent classification
- Slot filling (CRF layer)
Task:
- Text: 播放周董的青花瓷
- Intent=PlayMusic, Slot: Artist=周董, Song=青花瓷

Entity Resolution

Input
- Continuous features
  - Age, Time
- Categorical features
  - User: Device
  - Entity: Id, Name, Genre, Singer
Task:
- Text: 播放青花瓷 ("Play Qing Hua Ci")
- Intent=PlayMusic, Song=青花瓷, Entity=青花瓷 (id, 周杰伦)

Conversational AI Agent: Abstract Dialog Flow

User: 打电话给张三 ("Call Zhang San")
Agent: 好的,第几个? ("OK, which one?")
User: 不对是李四 ("No, it's Li Si")
Agent: 好的,确定拨打么? ("OK, confirm the call?")
User: 确定 ("Confirm")

[Diagram labels: Contacts, Phone call, Database, API, MakeCall, Simulator, Dialogues about phone calls]

Acharya, Anish, et al. Alexa Conversations: An Extensible Data-driven Approach for Building Task-oriented Dialogue Systems. Proceedings of the 2021 Conference of the North