Performance Bounds in L_p‐norm for Approximate Value Iteration
DOI: 10.1137/040614384
zbMATH Open: 1356.90159
OpenAlex: W2012547817
MaRDI QID: Q5453575
FDO: Q5453575
Authors: Rémi Munos
Publication date: 3 April 2008
Published in: SIAM Journal on Control and Optimization
Full work available at URL: https://doi.org/10.1137/040614384
Recommendations
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs.
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Finite-time bounds for fitted value iteration
- Analyzing approximate value iteration algorithms
dynamic programming, statistical learning, error analysis, optimal control, Markov decision processes, reinforcement learning, function approximation
Approximation methods and heuristics in mathematical programming (90C59); Dynamic programming in optimal control and differential games (49L20); Markov and semi-Markov decision processes (90C40); Optimal stochastic control (93E20)
Cited In (9)
- Quadratic approximate dynamic programming for input‐affine systems
- Settling the sample complexity of model-based offline reinforcement learning
- Transfer learning for contextual multi-armed bandits
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
- Exploiting action impact regularity and exogenous state variables for offline reinforcement learning
- Multi-agent reinforcement learning: a selective overview of theories and algorithms