Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming
Publication: 1785275
DOI: 10.1016/j.orl.2014.07.006
zbMath: 1408.90308
OpenAlex: W2088147131
Wikidata: Q115038539
Scholia: Q115038539
MaRDI QID: Q1785275
Eugene A. Feinberg, Jefferson Huang, Bruno Scherrer
Publication date: 28 September 2018
Published in: Operations Research Letters
Full work available at URL: https://hal.inria.fr/hal-01091370/file/Feinberg_Huang_Scherrer.pdf
Related Items (3)
- Complexity bounds for approximately solving discounted MDPs by value iterations
- Improved bound on the worst case complexity of policy iteration
- On the reduction of total-cost and average-cost MDPs to discounted MDPs
Cites Work
- The value iteration algorithm is not strongly polynomial for discounted dynamic programming
- A bound for the number of different basic solutions generated by the simplex method
- Strong polynomiality of policy iterations for average-cost MDPs modeling replacement and maintenance problems
- Solving H-horizon, stationary Markov decision problems in time proportional to log (H)
- The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate
- The Simplex Method is Strongly Polynomial for Deterministic Markov Decision Processes
- Modified Policy Iteration Algorithms for Discounted Markov Decision Problems
- Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor