Framing reinforcement learning from human reward: reward positivity, temporal discounting, episodicity, and performance
From MaRDI portal
Publication:891791
Recommendations
- Reward machines: exploiting reward function structure in reinforcement learning
- On average versus discounted reward temporal-difference learning
- Average reward reinforcement learning: foundations, algorithms, and empirical results
- Risk-sensitive reinforcement learning
- A survey of preference-based reinforcement learning methods
Cited in (2)