An information-theoretic analysis of return maximization in reinforcement learning
Publication:2375396
DOI: 10.1016/j.neunet.2011.05.002
zbMath: 1266.68156
OpenAlex: W2034994237
Wikidata: Q51559078
Scholia: Q51559078
MaRDI QID: Q2375396
Publication date: 14 June 2013
Published in: Neural Networks
Full work available at URL: https://doi.org/10.1016/j.neunet.2011.05.002
Keywords: information theory; reinforcement learning; asymptotic equipartition property; stochastic sequential decision process
Cites Work
- A Mathematical Theory of Communication
- Generalizations of Shannon-McMillan theorem
- The strong ergodic theorem for densities: Generalized Shannon-McMillan-Breiman theorem
- Asymptotically mean stationary measures
- Asynchronous stochastic approximation and Q-learning
- Convergence results for single-step on-policy reinforcement-learning algorithms
- \({\mathcal Q}\)-learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- A simple proof of the Moy-Perez generalization of the Shannon-McMillan theorem
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization
- Boundedness of iterates in \(Q\)-learning
- The Individual Ergodic Theorem of Information Theory
- Correction Notes: Correction to "The Individual Ergodic Theorem of Information Theory"
- A New Optimality Criterion for Nonhomogeneous Markov Decision Processes
- Approximation theory of output statistics
- The role of the asymptotic equipartition property in noiseless source coding
- The method of types [information theory]
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Discrete Dynamic Programming
- Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
- Elements of Information Theory
- The Basic Theorems of Information Theory