A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs.
DOI: 10.14736/KYB-2019-1-0081 · zbMATH Open: 1449.90356 · OpenAlex: W2921102909 · MaRDI QID: Q5227201 · FDO: Q5227201
Authors: Óscar Vega-Amaya, Joaquín López-Borbón
Publication date: 5 August 2019
Published in: Kybernetika
Full work available at URL: http://hdl.handle.net/10338.dmlcz/147707
Recommendations
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Value iteration for average cost Markov decision processes in Borel spaces
- Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces
- Policy iteration for average cost Markov control processes on Borel spaces
- Value Iteration in a Class of Communicating Markov Decision Chains with the Average Cost Criterion
Keywords: Markov decision processes; approximate value iteration algorithm; average cost criterion; contraction and non-expansive operators; perturbed Markov decision models
MSC: Approximation methods and heuristics in mathematical programming (90C59); Markov and semi-Markov decision processes (90C40); Optimal stochastic control (93E20)
Cites Work
- Markov chains and stochastic stability
- Infinite dimensional analysis. A hitchhiker's guide.
- A note on a variation of Doeblin's condition for uniform ergodicity of Markov chains
- Approximate Dynamic Programming
- The approximation of continuous functions by positive linear operators
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- On Actor-Critic Algorithms
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
- Stochastic approximations of constrained discounted Markov decision processes
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
- Learning algorithms for Markov decision processes with average cost
- Approximation of Markov decision processes with general state space
- Pseudometrics for State Aggregation in Average Reward Markov Decision Processes
- What you should know about approximate dynamic programming
- Convergence of Simulation-Based Policy Iteration
- Application of average dynamic programming to inventory systems
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Performance Bounds in $L_p$-norm for Approximate Value Iteration
- Continuous state dynamic programming via nonexpansive approximation
- Zero-Sum Average Semi-Markov Games: Fixed-Point Solutions of the Shapley Equation
- Markov chains and invariant probabilities
- Analysis of a Numerical Dynamic Programming Algorithm Applied to Economic Models
- Approximate receding horizon approach for Markov decision processes: average reward case
- A Cost-Shaping Linear Program for Average-Cost Approximate Dynamic Programming with Performance Guarantees
- Recurrence conditions for Markov decision processes with Borel state space: A survey
- On the optimality equation for average cost Markov control processes with Feller transition probabilities
- Simulation-based algorithms for Markov decision processes
- On the asymptotic optimality of finite approximations to Markov decision processes with Borel spaces
- The average cost optimality equation: a fixed point approach
- Discretization procedures for adaptive Markov control processes
- Convergence Properties of Policy Iteration
- A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis
- A generalization of Ueno's inequality for n-step transition probabilities
- Sample-Path Optimality and Variance-Minimization of Average Cost Markov Control Processes
- An unbounded Berge's minimum theorem with applications to discounted Markov decision processes
- Solutions of the average cost optimality equation for Markov decision processes with weakly continuous kernel: the fixed-point approach revisited
Cited In (6)
- Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces
- Analyzing approximate value iteration algorithms
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Performance Bounds in $L_p$-norm for Approximate Value Iteration
- Value iteration in average cost Markov control processes on Borel spaces
- Value iteration for average cost Markov decision processes in Borel spaces