Microsoft: 2021 research report on reducing data labeling cost with GPT-3 (English edition) (11 pages).pdf

Want To Reduce Labeling Cost? GPT-3 Can Help

Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
Microsoft Cognitive Services Research Group
shuowa, yaliu10, yicxu, chezhu,

Abstract

Data annotation is a time-consuming and labor-intensive process for many NLP tasks. Although there exist various methods to produce pseudo data labels, they are often task-specific and require a decent amount of labeled data to start with. Recently, the immense language model GPT-3 with 175 billion parameters has achieved tremendous improvement across many few-shot learning tasks. In this paper, we explore ways to leverage GPT-3 as a low-cost data labeler to train other models. We find that, to make the downstream model achieve the same performance on a variety of NLU and NLG tasks, it costs 50% to 96% less to use labels from GPT-3 than using labels from humans. Furthermore, we propose a novel framework of combining pseudo labels from GPT-3 with human labels, which leads to even better performance with limited labeling budget. These results present a cost-effective data labeling methodology that is generalizable to many practical applications.

1 Introduction

Data always plays a crucial role in developing machine learning models. However, collecting human-labeled data is a costly and time-consuming process, especially in multi-task scenarios. With the success of pre-trained models (Zhang et al., 2020; Raffel et al., 2020; Liu et al., 2019; Devlin et al., 2019) on unlabeled data, the performance of models under few-shot and zero-shot settings has been greatly enhanced. In particular, the large-scale language model GPT-3 (Brown et al., 2020), with 175 billion parameters, is the state-of-the-art few-shot learner on many NLP tasks.

However, GPT-3 is constrained on its immense model size and requires a large amount of resource to be deployed for real appli
