Fast convergence to state-action frequency polytopes for MDPs
From MaRDI portal
Recommendations
- On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies
- On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case
- Achieving Target State-Action Frequencies in Multichain Average-Reward Markov Decision Processes
- Convergence of Markov decision processes with constraints and state-action dependent discount factors
- A Note on the Convergence of Policy Iteration in Markov Decision Processes with Compact Action Spaces
Cites work
- scientific article; zbMATH DE number 700091
- scientific article; zbMATH DE number 3240812
- Finite state Markovian decision processes
- Hoeffding's inequality for uniformly ergodic Markov chains
- On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies
- Rate of Convergence of Empirical Measures and Costs in Controlled Markov Chains and Transient Optimality
Cited in (3)
MaRDI item Q1015315