Accelerated modified policy iteration algorithms for Markov decision processes
Publication: 2391867
DOI: 10.1007/S00186-013-0432-Y
zbMATH Open: 1273.90234
arXiv: 0806.0320
OpenAlex: W2081102656
Wikidata: Q115149137 (Scholia: Q115149137)
MaRDI QID: Q2391867
FDO: Q2391867
Authors: Oleksandr Shlakhter, Chi-Guhn Lee
Publication date: 5 August 2013
Published in: Mathematical Methods of Operations Research
Abstract: One of the most widely used methods for solving average-cost MDP problems is value iteration. This method, however, is often computationally impractical and limits the size of MDP problems that can be solved. We propose acceleration operators that improve the performance of value iteration for average-reward MDP models. These operators are based on two important properties of the Markovian operator: contraction mapping and monotonicity. It is well known that the classical relative value iteration methods for average-cost MDPs possess neither the max-norm contraction nor the monotonicity property. To overcome this difficulty, we propose to combine the acceleration operators with variants of value iteration for the stochastic shortest path problems associated with average-reward problems.
Full work available at URL: https://arxiv.org/abs/0806.0320
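To make the mechanics concrete, below is a minimal sketch of modified policy iteration for a discounted MDP, with a MacQueen-style span stopping rule and a midpoint extrapolation that exploits the same contraction and monotonicity properties the abstract refers to. The random MDP instance, the parameter names (`m_eval`, `tol`), and this particular extrapolation step are illustrative assumptions, not the authors' exact operators; the paper itself targets the average-reward case via associated stochastic shortest path problems, where these properties must first be recovered.

```python
# Sketch: modified policy iteration (MPI) on a discounted MDP with a
# contraction/monotonicity-based stopping rule and extrapolation.
# The MDP instance and parameters are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 20, 4, 0.95

# Random transition kernels P[a] (row-stochastic) and rewards r[a][s].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
r = rng.random((n_actions, n_states))

def bellman(V):
    """One Bellman optimality update: returns (TV, greedy policy)."""
    Q = r + gamma * P @ V                 # Q[a, s]
    return Q.max(axis=0), Q.argmax(axis=0)

def modified_policy_iteration(m_eval=10, tol=1e-8, max_iter=10_000):
    V = np.zeros(n_states)
    for k in range(max_iter):
        TV, pi = bellman(V)
        delta = TV - V
        # MacQueen bounds: V* lies between TV + gamma/(1-gamma)*min(delta)
        # and TV + gamma/(1-gamma)*max(delta), so the span of delta bounds
        # the remaining error; stop once the bracket is tight enough.
        if delta.max() - delta.min() < tol * (1 - gamma) / gamma:
            # Extrapolate to the midpoint of the bracket before returning.
            shift = gamma / (1 - gamma) * 0.5 * (delta.min() + delta.max())
            return TV + shift, pi, k
        # Partial policy evaluation: m_eval extra applications of T_pi.
        # This is the "modified" part: m_eval=0 is value iteration,
        # evaluating to convergence recovers classical policy iteration.
        P_pi = P[pi, np.arange(n_states)]  # P_pi[s, s'] under greedy pi
        r_pi = r[pi, np.arange(n_states)]
        V = TV
        for _ in range(m_eval):
            V = r_pi + gamma * P_pi @ V
    return V, pi, max_iter

V_star, pi_star, iters = modified_policy_iteration()
print(f"converged in {iters} outer iterations")
```

Note that both the stopping rule and the final extrapolation rest on the discounted operator being a monotone max-norm contraction; the paper's contribution is precisely that relative value iteration for the average-reward criterion lacks these properties, so the acceleration must instead be applied to the associated stochastic shortest path formulation.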
Recommendations
- Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes
- Multiply accelerated value iteration for nonsymmetric affine fixed point problems and application to Markov decision processes
- Monotone value iteration for discounted finite Markov decision processes
- Generic rank-one corrections for value iteration in Markovian decision problems
- A \(K\)-step look-ahead analysis of value iteration algorithms for Markov decision processes
Cites Work
- Title not available
- The Linear Programming Approach to Approximate Dynamic Programming
- Title not available
- Modified Policy Iteration Algorithms for Discounted Markov Decision Problems
- Value iteration and optimization of multiclass queueing networks
- Accelerating Procedures of the Value Iteration Algorithm for Discounted Markov Decision Processes, Based on a One-Step Lookahead Analysis
- Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes
- Title not available
- A modified dynamic programming method for Markovian decision problems
- Algorithms for Stochastic Games with Geometrical Interpretation
- Technical Note—Accelerated Computation of the Expected Discounted Return in a Markov Chain
- A \(K\)-step look-ahead analysis of value iteration algorithms for Markov decision processes
- Generic rank-one corrections for value iteration in Markovian decision problems
Cited In (11)
- Truncated policy iteration methods
- Acceleration Operators in the Value Iteration Algorithms for Markov Decision Processes
- Title not available
- Policy iteration type algorithms for recurrent state Markov decision processes
- Generic rank-one corrections for value iteration in Markovian decision problems
- Interval iteration algorithm for MDPs and IMDPs
- An Accelerated Value/Policy Iteration Scheme for Optimal Control Problems and Games
- Policy iteration accelerated with Krylov methods
- Prioritization methods for accelerating MDP solvers
- Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes
- On iterative optimization of structured Markov decision processes with discounted rewards