An information-theoretic analysis of return maximization in reinforcement learning
Recommendations
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization
- An information-theoretic analysis of Thompson sampling
- Reinforcement learning in finite MDPs: PAC analysis
- Near-optimal reinforcement learning in polynomial time
- Using Expectation-Maximization for Reinforcement Learning
Cites work
- scientific article; zbMATH DE number 3126094
- scientific article; zbMATH DE number 3148886
- scientific article; zbMATH DE number 3908323
- scientific article; zbMATH DE number 1043533
- scientific article; zbMATH DE number 1179314
- scientific article; zbMATH DE number 1821199
- A Mathematical Theory of Communication
- A New Optimality Criterion for Nonhomogeneous Markov Decision Processes
- A simple proof of the Moy-Perez generalization of the Shannon-McMillan theorem
- Approximation theory of output statistics
- Asymptotically mean stationary measures
- Asynchronous stochastic approximation and Q-learning
- Boundedness of iterates in \(Q\)-learning
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Correction Notes: Correction to "The Individual Ergodic Theorem of Information Theory"
- Discrete Dynamic Programming
- Elements of Information Theory
- Generalizations of Shannon-McMillan theorem
- Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
- The Basic Theorems of Information Theory
- The Individual Ergodic Theorem of Information Theory
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- The method of types [information theory]
- The role of the asymptotic equipartition property in noiseless source coding
- The strong ergodic theorem for densities: Generalized Shannon-McMillan-Breiman theorem
- \({\mathcal Q}\)-learning
Cited in (3)