Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
From MaRDI portal
Publication:5265786
Recommendations
- Approximation of average cost optimal policies for general Markov decision processes with unbounded costs
- Exact finite approximations of average-cost countable Markov decision processes
- Average cost Markov decision processes with weakly continuous transition probabilities
- Average cost Markov decision processes under the hypothesis of Doeblin
- Finite-State Approximations to Discounted and Average Cost Constrained Markov Decision Processes
- Average cost Markov decision processes with semi-uniform Feller transition probabilities
- Average cost Markov decision processes: Optimality conditions
- The convergence of value iteration in average cost Markov decision chains
Cites work
- Scientific article; zbMATH DE number 1321699 (no title available)
- A policy improvement method for constrained average Markov decision processes
- A time aggregation approach to Markov decision processes
- Approximate Dynamic Programming
- Approximate gradient methods in policy-space optimization of Markov reward processes
- Approximate receding horizon approach for Markov decision processes: average reward case
- Approximation of Markov decision processes with general state space
- Average cost Markov control processes: Stability with respect to the Kantorovich metric
- Average optimality for Markov decision processes in Borel spaces: a new condition and approach
- Convergence of simulation-based policy iteration
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
- Finite linear programming approximations of constrained discounted Markov decision processes
- Learning algorithms for Markov decision processes with average cost
- On Actor-Critic Algorithms
- Policy iteration for average cost Markov control processes on Borel spaces
- Simple bounds for the convergence of empirical and occupation measures in 1-Wasserstein distance
- Simulation-based optimization of Markov reward processes
- Universal Reinforcement Learning
Cited in (11)
- The average cost of Markov chains subject to total variation distance uncertainty
- Robustness to incorrect models and data-driven learning in average-cost optimal stochastic control
- A stability result for linear Markovian stochastic optimization problems
- Computable approximations for average Markov decision processes in continuous time
- Optimal deterministic controller synthesis from steady-state distributions
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- A convex optimization approach to dynamic programming in continuous state and action spaces
- From infinite to finite programs: explicit error bounds with applications to approximate dynamic programming
- Computable approximations for continuous-time Markov decision processes on Borel spaces based on empirical measures
- Approximation of discounted minimax Markov control problems and zero-sum Markov games using Hausdorff and Wasserstein distances
- Performance guarantees for empirical Markov decision processes with applications to multiperiod inventory models
This page was built for publication: Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities