scientific article; zbMATH DE number 5957492
Publication:3174155
zbMATH Open: 1222.68253; MaRDI QID: Q3174155; FDO: Q3174155
Authors: Sridhar Mahadevan, Mauro Maggioni
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v8/mahadevan07a.html
Title of this publication is not available
Recommendations
- Learning Representation and Control in Markov Decision Processes: New Frontiers
- scientific article; zbMATH DE number 1509479
- Value function based reinforcement learning in changing Markovian environments
- Efficient reinforcement learning in deterministic systems with value function generalization
- \(L^\ast\)-based learning of Markov decision processes (extended version)
- Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
- Policy gradient in Lipschitz Markov decision processes
- From reinforcement learning to optimal control: a unified framework for sequential decisions
- Learning control of finite Markov chains with an explicit trade-off between estimation and control
Keywords: spectral graph theory; manifold learning; Markov decision processes; value function approximation; reinforcement learning
Cited In (16)
- Online reinforcement learning for condition-based group maintenance using factored Markov decision processes
- Markov reward models and Markov decision processes in discrete and continuous time: performance evaluation and optimization
- Investigating the properties of neural network representations in reinforcement learning
- Multi-scale geometric methods for data sets. II: Geometric multi-resolution analysis
- Optimal Curiosity-Driven Modular Incremental Slow Feature Analysis
- Diffusion wavelets
- Automatic complexity reduction in reinforcement learning
- A sufficient statistic for influence in structured multiagent environments
- Adaptive critic design with graph Laplacian for online learning control of nonlinear systems
- Regularized feature selection in reinforcement learning
- Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Reinforcement learning for robust adaptive control of partially unknown nonlinear systems subject to unmatched uncertainties
- Actor-critic algorithms with online feature adaptation
- Slowness as a proxy for temporal predictability: an empirical comparison
- Reinforcement learning algorithms with function approximation: recent advances and applications