An analysis of model-based interval estimation for Markov decision processes

From MaRDI portal

Publication:959899

DOI: 10.1016/j.jcss.2007.08.009
zbMath: 1157.68059
OpenAlex: W1988526405
MaRDI QID: Q959899

Alexander L. Strehl, Michael L. Littman

Publication date: 12 December 2008

Published in: Journal of Computer and System Sciences

Full work available at URL: https://doi.org/10.1016/j.jcss.2007.08.009

zbMATH Keywords

Markov decision processes ⋮ learning theory ⋮ reinforcement learning

Mathematics Subject Classification ID

Markov processes: estimation; hidden Markov models (62M05) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Markov and semi-Markov decision processes (90C40) ⋮ Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) (68T20)

Related Items (7)

Adaptive aggregation for reinforcement learning in average reward Markov decision processes ⋮ Identity concealment games: how I learned to stop revealing and love the coincidences ⋮ Unnamed Item ⋮ Bayesian optimistic Kullback-Leibler exploration ⋮ Near-optimal PAC bounds for discounted MDPs ⋮ Deep Reinforcement Learning: A State-of-the-Art Walkthrough ⋮ Unnamed Item

Uses Software

R-MAX
