An empirical study of policy convergence in Markov decision process value iteration
From MaRDI portal
Publication:1886733
DOI: 10.1016/S0305-0548(03)00207-7
zbMath: 1076.90066
OpenAlex: W2094964720
MaRDI QID: Q1886733
William T. Scherer, Christopher W. Zobel
Publication date: 19 November 2004
Published in: Computers & Operations Research
Full work available at URL: https://doi.org/10.1016/s0305-0548(03)00207-7
Related Items
Cites Work
- Unnamed Item
- Unnamed Item
- Dynamic programming and stochastic control
- Geometric bounds for eigenvalues of Markov chains
- The convergence of value iteration in discounted Markov decision processes
- Time will tell: behavioural scoring and the dynamics of consumer credit assessment
- Finding Optimal Survey Policies via Adaptive Markov Decision Processes
- A New Value Iteration method for the Average Cost Dynamic Programming Problem