General time consistent discounting
From MaRDI portal
Abstract: A possibly immortal agent tries to maximise its summed discounted rewards over time, where discounting is used to avoid infinite utilities and encourage the agent to value current rewards more than future ones. Some commonly used discount functions lead to time-inconsistent behavior where the agent changes its plan over time. These inconsistencies can lead to very poor behavior. We generalise the usual discounted utility model to one where the discount function changes with the age of the agent. We then give a simple characterisation of time-(in)consistent discount functions and show the existence of a rational policy for an agent that knows its discount function is time-inconsistent.
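The time-inconsistency described in the abstract can be illustrated with a small sketch. This is not taken from the paper: the two discount families, the parameter values, and the pair of rewards below are assumptions chosen only to show a plan that changes as the agent ages. Each option is a (time, reward) pair, and the agent at time `now` weights a reward by a discount that depends on the delay until it arrives.

```python
from typing import Callable, List, Tuple

Discount = Callable[[int], float]   # maps a delay (in time steps) to a weight
Option = Tuple[int, float]          # (time the reward arrives, reward size)


def geometric(gamma: float) -> Discount:
    """Geometric discounting: weight gamma**delay (gamma is an assumed parameter)."""
    return lambda delay: gamma ** delay


def hyperbolic(kappa: float) -> Discount:
    """Hyperbolic discounting: weight 1 / (1 + kappa * delay) (kappa is assumed)."""
    return lambda delay: 1.0 / (1.0 + kappa * delay)


def preferred(discount: Discount, now: int, options: List[Option]) -> Option:
    """Return the option with the highest discounted value as seen at time `now`."""
    return max(options, key=lambda o: discount(o[0] - now) * o[1])


# Hypothetical choice: a smaller reward at time 5 or a larger one at time 6.
OPTIONS: List[Option] = [(5, 10.0), (6, 15.0)]


def plan_changes(discount: Discount) -> bool:
    """True if the choice planned at time 0 differs from the choice made at time 5."""
    return preferred(discount, 0, OPTIONS) != preferred(discount, 5, OPTIONS)


if __name__ == "__main__":
    print("geometric plan changes: ", plan_changes(geometric(0.9)))
    print("hyperbolic plan changes:", plan_changes(hyperbolic(1.0)))
```

Under these assumed numbers the geometric agent picks the later, larger reward at both times, while the hyperbolic agent plans to wait at time 0 and then switches to the smaller, earlier reward once time 5 arrives.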
Cites work
- scientific article; zbMATH DE number 4078557
- scientific article; zbMATH DE number 3638998
- scientific article; zbMATH DE number 1321699
- A course in game theory.
- Asymptotically efficient adaptive allocation rules
- Consistent Plans
- General Discounting Versus Average Reward
- On the Existence of a Consistent Course of Action when Tastes are Changing
- Self-Optimizing and Pareto-Optimal Policies in General Environments based on Bayes-Mixtures
- Stationary Ordinal Utility and Impatience
- Subgame-perfect equilibria of finite- and infinite-horizon games
- Universal artificial intelligence. Sequential decisions based on algorithmic probability.
Cited in (7)
- Optimal discounting
- New discounting functions
- Time consistent discounting
- Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective
- Extreme state aggregation beyond Markov decision processes
- Information, inattention, perception, and discounting
- On the computability of Solomonoff induction and AIXI
This page was built for publication: General time consistent discounting
MaRDI item Q391749