Framing reinforcement learning from human reward: reward positivity, temporal discounting, episodicity, and performance
Publication:891791
DOI: 10.1016/J.ARTINT.2015.03.009; zbMATH Open: 1343.68199; OpenAlex: W1453801241; MaRDI QID: Q891791; FDO: Q891791
Authors: W. Bradley Knox, Peter Stone
Publication date: 17 November 2015
Published in: Artificial Intelligence
Full work available at URL: https://doi.org/10.1016/j.artint.2015.03.009
Recommendations
- Reward machines: exploiting reward function structure in reinforcement learning
- On average versus discounted reward temporal-difference learning
- Average reward reinforcement learning: foundations, algorithms, and empirical results
- Risk-sensitive reinforcement learning
- A survey of preference-based reinforcement learning methods
Keywords: reinforcement learning; end-user programming; human teachers; human-agent interaction; interactive machine learning; modeling user behavior