
Loss is its own Reward: Self-Supervision for Reinforcement Learning

夕颜合欢落, 2022-07-18, 77 views

The authors treat signals such as the action, reward, and state as labels and train on them in a supervised fashion.
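To make the idea concrete, here is a minimal sketch of how logged transitions could be turned into supervised (input, label) pairs. The three tasks shown (predicting the action from consecutive states, classifying the sign of the reward, and predicting the next state) and all names such as `Transition` and `reward_pairs` are illustrative assumptions, not the paper's exact formulation or code.

```python
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One step of experience collected by the agent (hypothetical container)."""
    state: Tuple[float, ...]
    action: int
    reward: float
    next_state: Tuple[float, ...]


def inverse_dynamics_pairs(transitions: List[Transition]):
    """Input: state and next state concatenated. Label: the action taken."""
    return [(t.state + t.next_state, t.action) for t in transitions]


def reward_pairs(transitions: List[Transition]):
    """Input: state. Label: reward sign as a class (0 negative, 1 zero, 2 positive)."""
    def reward_class(r: float) -> int:
        if r < 0:
            return 0
        return 1 if r == 0 else 2
    return [(t.state, reward_class(t.reward)) for t in transitions]


def dynamics_pairs(transitions: List[Transition]):
    """Input: state with the action appended. Label: the next state."""
    return [(t.state + (float(t.action),), t.next_state) for t in transitions]


# Two toy transitions; the values are made up for illustration.
transitions = [
    Transition((0.0, 1.0), 1, 0.0, (0.5, 1.0)),
    Transition((0.5, 1.0), 0, 1.0, (0.5, 0.0)),
]

for name, build in [
    ("inverse dynamics", inverse_dynamics_pairs),
    ("reward", reward_pairs),
    ("dynamics", dynamics_pairs),
]:
    print(name, build(transitions))
```

Each list of pairs can be fed to an ordinary supervised loss (for example cross-entropy for the action and reward labels), so the labels come for free from the agent's own experience rather than from human annotation.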

 

Shiyu Huang (黄世宇)'s personal page: https://huangshiyu13.github.io/


