Report preview

毛宇航_RLChina23 - 周日上午 - 毛航宇 - 从强化学习(多)智能体到大语言模型(多)智能体(1)_watermark.pdf

PDF, 35 pages, 2.73 MB

From Reinforcement Learning (Multi-)Agents to Large Language Model (Multi-)Agents
Hangyu Mao (毛航宇), SenseTime
RLChina 2023, "Large Models and AI Agents"

Contents
SEIHAI: NeurIPS20, DAI21
TIT/PDiT: Submit to AAMAS24, Arxiv 22.12
TPTU: NeurIPS24-FMDM
TPTU-V2: Arxiv 23.11
Gated-ACML: AAAI20
NCC-MARL: AAAI20
STEER: Submit to AAAI24, Arxiv 23.05
LLaMAC: Arxiv 23.11
Grid axes: Single, Multi; DRL, TRL, LLM-based Agent

SEIHAI: A Sample-efficient Hierarchical AI for the MineRL Competition

Motivation
Verifying that agents can keep learning in open-ended environments has become an important direction in AI.
Minecraft is a natural "training ground" for this.
SEIHAI is the first fully learning-based agent to reach the "Iron Age" in the NeurIPS MineRL Competition.

Difficulties of Minecraft
Item dependencies, sparse rewards combined with long episodes, and no semantic information.

Training the scheduler boils down to a classification task.

Gated-ACML: Learning Agent Communication under Limited Bandwidth by Message Pruning

Motivation
Multi-agent communication is a long-standing research topic that studies how, what, and to whom to communicate.
In practical problems, however, communication bandwidth is limited. How should agents communicate under limited bandwidth?

Approach
The limited-bandwidth problem is recast as message pruning.
Message pruning is recast as binary classification.
How should T be set: dynamically (as in the slide's figure) or statically (?)

NCC-MARL: Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning

Motivation
How can multiple agents cooperate as well as humans do?
What characterizes human cooperation? Cognitive consistency!

Consistency; approximation; variational inference
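The Gated-ACML slides above recast message pruning as binary classification with a threshold T that can be set statically or dynamically. The preview does not say which quantity is compared against T, so the following is only a minimal sketch under stated assumptions: each candidate message has a hypothetical scalar `gain` estimating how much sending it helps, and the dynamic rule picks T from recent gains so that roughly a `keep_ratio` fraction of messages fits the bandwidth budget. The function names, `gain`, and `keep_ratio` are all illustrative and not taken from the slides.

```python
from typing import List


def static_labels(gains: List[float], threshold: float) -> List[int]:
    """Label a message 1 (send) when its assumed gain exceeds a fixed T, else 0 (prune)."""
    return [1 if g > threshold else 0 for g in gains]


def dynamic_threshold(recent_gains: List[float], keep_ratio: float) -> float:
    """Choose T so that roughly `keep_ratio` of the recent messages would be sent.

    Assumption: the bandwidth limit is expressed as the fraction of messages
    that may be sent. Ties at T are pruned, so slightly fewer may be kept.
    """
    if not recent_gains:
        raise ValueError("recent_gains must not be empty")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    ordered = sorted(recent_gains, reverse=True)
    # Number of messages the budget allows; always at least one.
    n_keep = max(1, int(len(ordered) * keep_ratio))
    if n_keep >= len(ordered):
        return float("-inf")  # budget covers every message, prune nothing
    # T is the largest gain among the pruned messages, so gains > T are kept.
    return ordered[n_keep]


if __name__ == "__main__":
    gains = [0.9, 0.1, 0.5, 0.3]
    print(static_labels(gains, threshold=0.4))
    t = dynamic_threshold(gains, keep_ratio=0.5)
    print(t, static_labels(gains, t))
```

Under these assumptions, the resulting 0/1 labels are what a binary classifier (the gate) would be trained to predict. A static T is simpler but does not track a changing gain distribution, whereas the dynamic rule keeps the send rate close to the budget.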
