Performance Bounds in L_p‐norm for Approximate Value Iteration
From MaRDI portal
Publication:5453575
Recommendations
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Finite-time bounds for fitted value iteration
- Analyzing approximate value iteration algorithms
Cited in (11)
- Quadratic approximate dynamic programming for input-affine systems
- Analyzing approximate value iteration algorithms
- Settling the sample complexity of model-based offline reinforcement learning
- Transfer learning for contextual multi-armed bandits
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
- Exploiting action impact regularity and exogenous state variables for offline reinforcement learning
- Finite-time bounds for fitted value iteration
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
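The publication concerns approximate value iteration, in which each exact Bellman backup is followed by an approximation step whose error the paper's $L_p$-norm bounds control. A minimal sketch of that scheme (the two-state MDP, its numbers, and the grid-rounding "approximator" are all hypothetical illustrations, not taken from the paper):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP: P[a][s, s'] are transition
# probabilities, r[a][s] are rewards, gamma is the discount factor.
gamma = 0.9
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.5, 0.5], [0.9, 0.1]])]
r = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]

def bellman(V):
    # Exact Bellman optimality backup: max over actions of r_a + gamma * P_a V.
    return np.max([r[a] + gamma * P[a] @ V for a in range(2)], axis=0)

def approximate(V, step=0.05):
    # Stand-in for a function approximator: project values onto a coarse
    # grid of width `step`, introducing a bounded per-iteration error.
    return np.round(V / step) * step

# Approximate value iteration: backup, then approximate, repeated.
V = np.zeros(2)
for _ in range(200):
    V = approximate(bellman(V))

# Residual of the resulting values under the exact backup; it stays on the
# order of the per-step approximation error, the quantity such bounds track.
residual = np.max(np.abs(bellman(V) - V))
```

Since the backup is a gamma-contraction, the accumulated effect of the per-step rounding error on the final residual remains bounded, which is the qualitative behavior the paper quantifies in weighted $L_p$-norms.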
This page was built for publication: Performance Bounds in $L_p$‐norm for Approximate Value Iteration