Discounting, Ergodicity and Convergence for Markov Decision Processes
From MaRDI portal
Publication:4132287
Cited in (13)
- A survey of algorithmic methods for partially observed Markov decision processes
- Computation techniques for large scale undiscounted Markov decision processes
- Improved iterative computation of the expected discounted return in Markov and semi-Markov chains
- Serial and parallel value iteration algorithms for discounted Markov decision processes
- The infinite horizon non-stationary stochastic inventory problem: Near myopic policies and weak ergodicity
- Contraction mappings underlying undiscounted Markov decision problems
- Decision and forecast horizons in a stochastic environment: A survey
- The rate of convergence for backwards products of a convergent sequence of finite Markov matrices
- Solving linear systems by methods based on a probabilistic interpretation
- The method of value oriented successive approximations for the average reward Markov decision process
- Action-dependent stopping times and Markov decision process with unbounded rewards
- Sensitivity analysis in discrete dynamic programming
- Periodic review stochastic inventory problem with forecast updates: Worst-case bounds for the myopic solution