Multiscale Q-learning with linear function approximation
Publication:312650
DOI: 10.1007/s10626-015-0216-z
zbMath: 1346.93265
OpenAlex: W2194349390
MaRDI QID: Q312650
Publication date: 16 September 2016
Published in: Discrete Event Dynamic Systems
Full work available at URL: https://doi.org/10.1007/s10626-015-0216-z
Keywords: differential inclusion; stochastic approximation; ordinary differential equation; reinforcement learning; multi-stage stochastic shortest path problem; Q-learning with linear function approximation
Learning and adaptive systems in artificial intelligence (68T05); Time-scale analysis and singular perturbations in control/observation systems (93C70); Stochastic systems in control theory (general) (93E03)
Cites Work
- An online actor-critic algorithm with function approximation for constrained Markov decision processes
- Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
- A one-measurement form of simultaneous perturbation stochastic approximation
- Nonconvergence to unstable points in urn models and stochastic approximations
- Natural actor-critic algorithms
- Stochastic approximation methods for constrained and unconstrained systems
- Asynchronous stochastic approximation and Q-learning
- Stochastic approximation with two time scales
- Average cost temporal-difference learning
- \({\mathcal Q}\)-learning
- New algorithms of the Q-learning type
- Reinforcement learning based algorithms for average cost Markov decision processes
- Learning Algorithms for Markov Decision Processes with Average Cost
- A simple dynamic routing problem
- Recursive Stochastic Algorithms for Global Optimization in \(\mathbb{R}^d\)
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- On the optimal assignment of customers to parallel servers
- Some Pathological Traps for Stochastic Approximation
- An analysis of temporal-difference learning with function approximation
- On Actor-Critic Algorithms
- Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences
- Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization
- Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization
- Actor-Critic–Type Learning Algorithms for Markov Decision Processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
- Stochastic Approximations and Differential Inclusions
- Stochastic Approximations and Differential Inclusions, Part II: Applications
- Q-Learning with Linear Function Approximation
- Multiscale Stochastic Approximation for Parametric Optimization of Hidden Markov Models
- Perturbation theory and finite Markov chains