Learning Robotic Manipulation by Leveraging Foundation Models

Huazhe Xu
Tsinghua University, IIIS
Shanghai Qi Zhi Institute

Robotic Manipulation
Reinforcement learning is widely used!
- How to get the tasks in simulation?
- How to define reward or goal?
- How to get policy weights?

Today's agenda
- Defining visual goals with diffusion models (LfVoid)
- Obtaining tasks in simulators with LLMs (GenSim)
- Initializing policy weights with LMs (LaMo)

Goals from diffusion models
- Expert demonstrations
- Visual goals from human
- Knowledge from pre-trained model
Costly to collect

Learning from the Void (LfVoid)
Can Pre-Trained Text-to-Image Models Generate Visual Goals for Reinforcement Learning?
Jialu Gao*, Kaizhe Hu*, Guowei Xu, Huazhe Xu

Method: Overview of LfVoid
[Figure: initial scene and generated goals]

Method: Feature extracting module
- a special token "sks" that captures the visual features of objects
- high resemblance of the edited images to the source images

Method: Inversion module
- uncovers the diffusion process
- fine-grained control to achieve accurate editing

Method: Editing module
- Appearance-based editing is achieved through attention map injection from the diffusion process of the input image to that of the edited image.
- Structure-based editing is achieved through attention map injection as well as attention map strengthening and weakening based on the given bounding box to achieve object displacement.

Method: Example-based visual RL
- LfVoid learns a discriminator
- generated goal images as positive samples
- the observation images sampled during random exploration as negative samples
- logits of discriminator for the positive cl
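
The example-based setup above (a discriminator trained with generated goal images as positives and random-exploration observations as negatives) can be illustrated with a small sketch. This is not the LfVoid implementation: a plain logistic-regression classifier over toy 2-D feature vectors stands in for an image discriminator, and all names (`train_discriminator`, `reward`, `goal_feats`, `explore_feats`) are hypothetical. Because the slide text is cut off, treating the positive-class logit as the reward signal is an assumption made here for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(w, b, x):
    # Positive-class logit of a linear discriminator.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_discriminator(positives, negatives, epochs=200, lr=0.5):
    """Fit logistic regression with full-batch gradient descent.

    positives: feature vectors standing in for generated goal images
    negatives: feature vectors standing in for exploration observations
    """
    dim = len(positives[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1.0) for x in positives] + [(x, 0.0) for x in negatives]
    n = len(data)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(logit(w, b, x)) - y  # d(cross-entropy)/d(logit)
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def reward(w, b, observation):
    # Assumption: the positive-class logit is used directly as the reward.
    return logit(w, b, observation)


# Toy data: "goal" features cluster near (1, 1), exploration near (0, 0).
rng = random.Random(0)
goal_feats = [[1 + rng.gauss(0, 0.1), 1 + rng.gauss(0, 0.1)] for _ in range(50)]
explore_feats = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)]

w, b = train_discriminator(goal_feats, explore_feats)
r_goal = reward(w, b, [1.0, 1.0])    # observation that looks like a goal
r_start = reward(w, b, [0.0, 0.0])   # observation that looks like exploration
print(r_goal, r_start)
```

In this sketch, observations that resemble the positive samples receive a higher score than those that resemble exploration data, which is the property an RL agent would exploit. A real system would replace the linear model with an image encoder and train it alongside the policy.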