On the possibility of learning in reactive environments with arbitrary dependence
Publication:950202
DOI: 10.1016/J.TCS.2008.06.039; zbMATH Open: 1158.68039; arXiv: 0810.5636; OpenAlex: W1969028245; Wikidata: Q58012401; Scholia: Q58012401; MaRDI QID: Q950202; FDO: Q950202
Publication date: 22 October 2008
Published in: Theoretical Computer Science
Abstract: We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward where the true generating environment is unknown but belongs to a known countable family of environments. We find some sufficient conditions on the class of environments under which an agent exists which attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.
Full work available at URL: https://arxiv.org/abs/0810.5636
Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40)
Cites Work
- Prediction, Learning, and Games
- Title not available
- Title not available
- Title not available
- Information Theory and Statistics: A Tutorial
- Nonparametric statistics for stochastic processes
- Universal artificial intelligence. Sequential decisions based on algorithmic probability.
- Self-Optimizing and Pareto-Optimal Policies in General Environments based on Bayes-Mixtures
- General Discounting Versus Average Reward
- 10.1162/1532443041827952
- Algorithmic Learning Theory
- Algorithmic Learning Theory
- Predicting non-stationary processes
Cited In (4)