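Inspiration that's Infinite

Industry Application Forum
The Rights and Wrongs of ByteDance's Large-Model Simultaneous Interpretation Development
程善伯

Human evaluation (%)
How much of the speaker's true intent is translated correctly (accuracy) and is easy for users to understand (readability).
Goal: a simultaneous interpretation agent that approaches or even exceeds human level.

Right call 1: An aggressive goal
Taking Chinese-to-English as the example, our goal: a native English speaker who does not understand Chinese uses our system to follow a 1-hour meeting, and more than 80% of the information is conveyed accurately.
Focus on real-world results; do not fool ourselves.
Evaluation by professional human interpreters: in real, high-difficulty internal meetings, commercial systems reach only 20%+ accuracy. Only 70% or above counts as acceptable simultaneous interpretation.

Right call 2: Big changes
Read/write policy. Source: Liu, Xiaoqian, Guoqiang Hu, Yangfan Du, Erfeng He, YingFeng Luo, Chen Xu, Tong Xiao, and Jingbo Zhu. Recent Advances in End-to-End Simultaneous Speech Translation. arXiv preprint arXiv:2406.00497 (2024).
Translation capability + read/write policy
Traditional approach: a machine translation model handles translation
Hand-designed policy vs. learning the policy from data
Cascaded system vs. end-to-end system

Right call 3: Quality and latency

Mistake 1: Over-polishing
Consumed a huge number of A100 GPU hours
Team updates filled with assorted ablation studies
Fine-grained experiments
Early 2024
Result: human evaluation improved by only +5%, a very small gain!

Mistake 2: Automatic evaluation
Automatic metrics rose significantly on multiple test sets
No human evaluation monitored the process; everything was staked on one final run
Human evaluation went from 39% to 44%, essentially no real gain
Automatic evaluation vs. human evaluation

The mistake we are making now
Scaling up nonstop while data annotation quality degrades

Summary
Set aggressive goals
Dare to make big changes
Make the right judgments along the way
Refuse to over-polish

Publications (2024)
[1] Xingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, and Shanbo Cheng. 2024. G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15395-15406, Bangkok, Thailand. Association for Computational Linguistics. ACL 2024 Outstanding Paper.
[2] Zhichao Huang, Chutong Meng, and Tom Ko. 2024. RepCodec: A Speech Representation Codec for Speech Tokenization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 57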