The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate
From MaRDI portal
Publication: Q2884291
DOI: 10.1287/moor.1110.0516 · zbMath: 1245.90140 · OpenAlex: W1984901446 · MaRDI QID: Q2884291
Publication date: 24 May 2012
Published in: Mathematics of Operations Research
Full work available at URL: https://doi.org/10.1287/moor.1110.0516
Analysis of algorithms and problem complexity (68Q25) · Linear programming (90C05) · Dynamic programming (90C39) · Markov and semi-Markov decision processes (90C40)
Related Items (40)
A complexity analysis of policy iteration through combinatorial matrices arising from unique sink orientations ⋮ The greedy strategy for optimizing the Perron eigenvalue ⋮ More bounds on the diameters of convex polytopes ⋮ On Augmentation Algorithms for Linear and Integer-Linear Programming: From Edmonds--Karp to Bland and Beyond ⋮ Improved and Generalized Upper Bounds on the Complexity of Policy Iteration ⋮ Continue, quit, restart probability model ⋮ The Simplex Method is Strongly Polynomial for Deterministic Markov Decision Processes ⋮ The value iteration algorithm is not strongly polynomial for discounted dynamic programming ⋮ Complexity bounds for approximately solving discounted MDPs by value iterations ⋮ Reformulation of the linear program for completely ergodic MDPs with average cost criteria ⋮ The stochastic shortest path problem: a polyhedral combinatorics perspective ⋮ On the Number of Solutions Generated by the Simplex Method for LP ⋮ Random search for constrained Markov decision processes with multi-policy improvement ⋮ Uniform Turnpike Theorems for Finite Markov Decision Processes ⋮ Dual Ascent and Primal-Dual Algorithms for Infinite-Horizon Nonstationary Markov Decision Processes ⋮ Unnamed Item ⋮ Joint chance-constrained Markov decision processes ⋮ Strong polynomiality of the Gass-Saaty shadow-vertex pivoting rule for controlled random walks ⋮ Randomized Linear Programming Solves the Markov Decision Problem in Nearly Linear (Sometimes Sublinear) Time ⋮ On the number of solutions generated by the dual simplex method ⋮ A Friendly Smoothed Analysis of the Simplex Method ⋮ A double-pivot simplex algorithm and its upper bounds of the iteration numbers ⋮ Towards solving 2-TBSG efficiently ⋮ Strong polynomiality of policy iterations for average-cost MDPs modeling replacement and maintenance problems ⋮ Optimal schedulers vs optimal bases: an approach for efficient exact solving of Markov decision processes ⋮ What Tropical Geometry Tells Us about the Complexity of Linear Programming ⋮ The double pivot simplex method ⋮ Policy iteration for robust nonstationary Markov decision processes ⋮ Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming ⋮ A primal-simplex based Tardos' algorithm ⋮ Improved bound on the worst case complexity of policy iteration ⋮ Computing Kitahara-Mizuno's bound on the number of basic feasible solutions generated with the simplex algorithm ⋮ Comments on: Recent progress on the combinatorial diameter of polytopes and simplicial complexes ⋮ Variance Reduced Value Iteration and Faster Algorithms for Solving Markov Decision Processes ⋮ Efficient computation of a canonical form for a matrix with the generalized P-property ⋮ The simplex method using Tardos' basic algorithm is strongly polynomial for totally unimodular LP under nondegeneracy assumption ⋮ The operator approach to entropy games ⋮ On the reduction of total‐cost and average‐cost MDPs to discounted MDPs ⋮ Multiply Accelerated Value Iteration for NonSymmetric Affine Fixed Point Problems and Application to Markov Decision Processes ⋮ A scaling-invariant algorithm for linear programming whose running time depends only on the constraint matrix