The method of value oriented successive approximations for the average reward Markov decision process
From MaRDI portal
Publication:1144501
DOI: 10.1007/BF01719500 · zbMath: 0443.90109 · OpenAlex: W2123114919 · MaRDI QID: Q1144501
Publication date: 1980
Published in: OR Spektrum
Full work available at URL: https://doi.org/10.1007/bf01719500
Keywords: convergence; average reward; finite action space; finite state space; almost optimal solutions; value oriented successive approximations
Numerical mathematical programming methods (65K05); Markov and semi-Markov decision processes (90C40)
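The record classifies this paper under undiscounted (average-reward) Markov decision processes solved by successive approximations. Since only the metadata is available here, the following is a minimal, hedged sketch of the general technique named in the title — undiscounted value iteration with span-based stopping and gain bounds — not a reconstruction of the paper's exact "value oriented" variant. All function names and the toy transition data are invented for illustration.

```python
import numpy as np

def value_iteration_average_reward(P, r, tol=1e-8, max_iter=10_000):
    """Sketch of successive approximations for a finite average-reward MDP.

    P[a]: n x n transition matrix for action a; r[a]: length-n reward vector.
    Returns an estimate of the optimal gain and a greedy policy.
    """
    n = P[0].shape[0]
    v = np.zeros(n)
    for _ in range(max_iter):
        # One-step lookahead (Bellman operator) over all actions.
        q = np.array([r[a] + P[a] @ v for a in range(len(P))])
        v_new = q.max(axis=0)
        diff = v_new - v
        # The span seminorm of successive differences brackets the gain
        # (Odoni-type bounds) and serves as the stopping criterion.
        span = diff.max() - diff.min()
        # Subtract a constant so the iterates stay bounded in the
        # undiscounted case (relative value iteration).
        v = v_new - v_new.min()
        if span < tol:
            break
    gain = (diff.max() + diff.min()) / 2  # midpoint of the gain bounds
    policy = q.argmax(axis=0)             # greedy (almost optimal) policy
    return gain, policy

# Two-state, two-action toy example (invented data).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.5, 0.5], [0.6, 0.4]])]
r = [np.array([1.0, 0.0]), np.array([0.5, 2.0])]
gain, policy = value_iteration_average_reward(P, r)
```

Geometric convergence of the span, and hence of the bounds on the maximal gain, holds under aperiodicity/unichain-type conditions of the kind studied in the works cited below.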
Related Items
- The numerical exploitation of periodicity in Markov decision processes
- A value iteration method for undiscounted multichain Markov decision processes
- MARKOV DECISION PROCESSES
Cites Work
- A successive approximation algorithm for an undiscounted Markov decision process
- Dynamic programming, Markov chains, and the method of successive approximations
- Iterative solution of the functional equations of undiscounted Markov renewal programming
- Discounting, Ergodicity and Convergence for Markov Decision Processes
- A set of successive approximation methods for discounted Markovian decision problems
- Technical Note—Improved Conditions for Convergence in Undiscounted Markov Renewal Programming
- Geometric convergence of value-iteration in multichain Markov decision problems
- The Asymptotic Behavior of Undiscounted Value Iteration in Markov Decision Problems
- Optimal decision procedures for finite Markov chains. Part II: Communicating systems
- Technical Note—The Method of Successive Approximations and Markovian Decision Problems
- On Finding the Maximal Gain for Markov Decision Processes
- Technical Note—Bounds on the Gain of a Markov Decision Process
- Technical Note—Undiscounted Markov Renewal Programming Via Modified Successive Approximations
- Some Bounds for Discounted Sequential Decision Processes