FastSpeech: Algorithm Design and Optimization for Efficient Speech Synthesis.pdf

PDF, 36 pages, 1.51 MB

FastSpeech: Algorithm and Optimization for State-of-the-art Text to Speech
Xu Tan & Dabi Ahn
Microsoft Research Asia & NVIDIA

Outline
- The algorithm of FastSpeech, by Xu Tan, Microsoft Research Asia
- The optimization of FastSpeech, by Dabi Ahn, NVIDIA

About text to speech system
[Figure: TTS pipeline. Text ("FastSpeech") -> Frontend -> Phoneme ("F ae s t s p iy ch") -> Acoustic Model -> Mel-spectrogram -> Vocoder -> Speech]

About FastSpeech
A fast, robust, controllable, high-quality, and end-to-end text to speech (TTS) system
- FastSpeech: Fast, Robust and Controllable Text to Speech, NeurIPS 2019 [1]
- FastSpeech 2: Fast and High-Quality End-to-End Text to Speech, ICLR 2021 submission [2]
- Widely supported by the community and deployed in the Microsoft Azure TTS service to support all the languages
[1] https://proceedings.neurips.cc/paper/2019/file/f63f65b503e22cb970527f23c9ad7db1-Paper.pdf
[2] https:/…

… these issues?
- Slow inference speed
  - Autoregressive generation
  - Inference time depends on sequence length (for 5 s of speech, the mel-spectrogram length is about 500)
- Not robust
  - Encoder-decoder attention is not accurate, causing repeating and skipping
- Lack of controllability
  - No control information as input
  - Autoregressive generation cannot explicitly control the duration

Our solution: FastSpeech
Key designs
- Generate the mel-spectrogram in parallel (for speedup)
- Remove the attention mechanism between text and speech (for robustness)
- Variance adaptor introduces duration, pitch, and energy (for controllability)
FastSpeech has the following advantages
- Fast: 270x inference speedup on mel-spectrogram generation, 38x speedup on voice generation!
- Robust: no bad case of word skipping and repeating
- Controllable: can control voice speed and prosody
- Voice quality: on par or better than SOTA model

Our solution: FastSpeech
Feed-forward transformer: generate mel-spec…
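The key designs above tie controllability to explicit durations: once each phoneme has a predicted frame count, the mel-spectrogram length is known up front and all frames can be generated in parallel. The FastSpeech paper calls this expansion step a length regulator. The sketch below is a minimal illustration of that idea in plain Python, not the authors' implementation; the function name `length_regulate`, the `speed` parameter, and the round-half-up choice are assumptions made for this example.

```python
from typing import List, Sequence


def length_regulate(
    hidden: Sequence[Sequence[float]],
    durations: Sequence[int],
    speed: float = 1.0,
) -> List[List[float]]:
    """Expand phoneme-level vectors to frame level (illustrative sketch).

    hidden:    one feature vector per phoneme (hypothetical encoder output).
    durations: number of mel frames assigned to each phoneme.
    speed:     values above 1.0 shorten every phoneme (faster speech),
               values below 1.0 lengthen it (slower speech).
    """
    if len(hidden) != len(durations):
        raise ValueError("hidden and durations must have the same length")
    if speed <= 0:
        raise ValueError("speed must be positive")

    frames: List[List[float]] = []
    for vector, duration in zip(hidden, durations):
        # Scale the duration, rounding half up (an assumption of this sketch).
        n_frames = max(0, int(duration / speed + 0.5))
        # Repeat the phoneme vector once per frame; copy so frames are independent.
        frames.extend(list(vector) for _ in range(n_frames))
    return frames


if __name__ == "__main__":
    # Three phonemes with one-dimensional toy features.
    phonemes = [[1.0], [2.0], [3.0]]
    normal = length_regulate(phonemes, [2, 4, 6])
    fast = length_regulate(phonemes, [2, 4, 6], speed=2.0)
    print(len(normal), len(fast))
```

Because the output length is simply the sum of the scaled durations, changing `speed` changes the voice speed without any autoregressive decoding, which is the kind of control the slides attribute to the duration input.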
