Loss is its own Reward: Self-Supervision for Reinforcement Learning

夕颜合欢落

2022-07-18

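The authors take signals that the RL data already contains, such as the action, reward, and state, and use them as labels for supervised training.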
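As a rough illustration of the idea, the sketch below turns a list of transitions into (input, label) pairs for supervised training, with the label taken from the reward, the action, or the next state. The `Transition` type, the three task names, and the three-way reward binning are assumptions made for this example; the note above does not specify the exact tasks or losses.

```python
from typing import Dict, List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One step of experience (hypothetical container for this sketch)."""
    state: Tuple[float, ...]
    action: int
    reward: float
    next_state: Tuple[float, ...]


def reward_class(reward: float) -> int:
    """Bin a scalar reward into 0 (negative), 1 (zero), or 2 (positive)."""
    if reward < 0:
        return 0
    if reward > 0:
        return 2
    return 1


def make_supervised_pairs(transitions: List[Transition]) -> Dict[str, list]:
    """Build (input, label) pairs whose labels come from the transitions themselves."""
    tasks: Dict[str, list] = {"reward": [], "inverse_dynamics": [], "dynamics": []}
    for t in transitions:
        # Reward as label: predict the binned reward from the state.
        tasks["reward"].append((t.state, reward_class(t.reward)))
        # Action as label: predict which action led from state to next_state.
        tasks["inverse_dynamics"].append(((t.state, t.next_state), t.action))
        # State as label: predict the next state from the state and action.
        tasks["dynamics"].append(((t.state, t.action), t.next_state))
    return tasks


demo = [
    Transition((0.0, 1.0), 1, 0.0, (0.5, 1.0)),
    Transition((0.5, 1.0), 0, 1.0, (0.5, 0.0)),
]
pairs = make_supervised_pairs(demo)
```

Each list in `pairs` can then be fed to an ordinary classifier or regressor. No human annotation is needed, because every label is read directly off the agent's own experience.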

黄世宇 / Shiyu Huang's Personal Page: https://huangshiyu13.github.io/
