夕颜合欢落


Loss is its own Reward: Self-Supervision for Reinforcement Learning


2022-07-18

The authors treat quantities such as the action, reward, and state as labels and train on them in a supervised fashion.
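As a rough illustration of turning actions, rewards, and states into supervised labels, the sketch below builds (input, label) pairs from logged transitions. The `Transition` type, the three helper functions, and the specific choice of tasks (predicting the action from two consecutive states, classifying the sign of the reward, predicting the next state) are hypothetical names and assumptions made here for illustration; this note does not spell out the paper's exact auxiliary losses, so this should not be read as the authors' implementation.

```python
from typing import List, NamedTuple, Tuple

State = Tuple[float, ...]


class Transition(NamedTuple):
    """One logged environment step (hypothetical container for this sketch)."""
    state: State
    action: int
    reward: float
    next_state: State


def inverse_dynamics_examples(transitions: List[Transition]):
    """Action as label: input is (state, next_state), target is the action taken."""
    return [((t.state, t.next_state), t.action) for t in transitions]


def reward_examples(transitions: List[Transition]):
    """Reward as label: input is the state, target is the reward's sign class.

    Classes (an assumption of this sketch): 0 = negative, 1 = zero, 2 = positive.
    """
    def reward_class(r: float) -> int:
        if r < 0:
            return 0
        if r > 0:
            return 2
        return 1

    return [(t.state, reward_class(t.reward)) for t in transitions]


def next_state_examples(transitions: List[Transition]):
    """State as label: input is (state, action), target is the next state."""
    return [((t.state, t.action), t.next_state) for t in transitions]


# Tiny hand-written batch, only to show the shapes of the resulting pairs.
demo = [
    Transition((0.0,), 1, 1.0, (1.0,)),
    Transition((1.0,), 0, -0.5, (0.0,)),
]
inverse_pairs = inverse_dynamics_examples(demo)
reward_pairs = reward_examples(demo)
```

Each list of pairs can then be fed to an ordinary supervised learner that shares its encoder with the policy, so the labels come for free from interaction data rather than from human annotation.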

 

黄世宇/Shiyu Huang's Personal Page: https://huangshiyu13.github.io/


