On the Empirical State-Action Frequencies in Markov Decision Processes Under General Policies
From MaRDI portal
Publication:5704236
DOI: 10.1287/moor.1050.0148 ⋮ zbMath: 1082.90131 ⋮ OpenAlex: W2136937392 ⋮ MaRDI QID: Q5704236
Shie Mannor, John N. Tsitsiklis
Publication date: 11 November 2005
Published in: Mathematics of Operations Research
Full work available at URL: https://semanticscholar.org/paper/1c93280ca091393d0653ef1e21874f761d6b3653
Markov and semi-Markov decision processes (90C40) ⋮ Convergence of probability measures (60B10) ⋮ Limit theorems in probability theory (60F99)
Related Items (6)
Acceptable strategy profiles in stochastic games ⋮ Simulation-based optimization of Markov decision processes: an empirical process theory approach ⋮ Fluctuation Bounds for the Max-Weight Policy with Applications to State Space Collapse ⋮ Stationary anonymous sequential games with undiscounted rewards ⋮ NP-hardness of checking the unichain condition in average cost MDPs ⋮ Fast convergence to state-action frequency polytopes for MDPs