An analysis of model-based interval estimation for Markov decision processes
DOI: 10.1016/j.jcss.2007.08.009
zbMath: 1157.68059
OpenAlex: W1988526405
MaRDI QID: Q959899
Alexander L. Strehl, Michael L. Littman
Publication date: 12 December 2008
Published in: Journal of Computer and System Sciences
Full work available at URL: https://doi.org/10.1016/j.jcss.2007.08.009
Markov processes: estimation; hidden Markov models (62M05)
Learning and adaptive systems in artificial intelligence (68T05)
Markov and semi-Markov decision processes (90C40)
Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)
Cites Work
- Adaptive treatment allocation and the multi-armed bandit problem
- Bounded-parameter Markov decision processes
- Near-optimal reinforcement learning in polynomial time
- 10.1162/153244303765208377
- A Simple Distribution-Free Approach to the Max k-Armed Bandit Problem
- A theory of the learnable
- 10.1162/153244303321897663
- Robust Control of Markov Decision Processes with Uncertain Transition Matrices