scientific article

From MaRDI portal

Publication:3093180

zbMath: 1222.68196; MaRDI QID: Q3093180

Yishay Mansour, Eyal Even-Dar

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v5/evendar03a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

stochastic processes; reinforcement learning; Q-learning; learning rates; convergence bounds

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05); Dynamic programming (90C39); Markov chains (discrete-time Markov processes on discrete state spaces) (60J10)

Related Items (22)

Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming ⋮ Unified reinforcement Q-learning for mean field game and control problems ⋮ Reinforcement learning-based design of side-channel countermeasures ⋮ Reinforcement learning with algorithms from probabilistic structure estimation ⋮ Error bounds for constant step-size \(Q\)-learning ⋮ A concentration bound for \(\operatorname{LSPE}( \lambda )\) ⋮ A Discrete-Time Switching System Analysis of Q-Learning ⋮ Recent advances in reinforcement learning in finance ⋮ A stochastic contraction mapping theorem ⋮ Settling the sample complexity of model-based offline reinforcement learning ⋮ Integrated condition-based maintenance and multi-item lot-sizing with stochastic demand ⋮ Cooperative and geometric learning algorithm (CGLA) for path planning of UAVs with limited information ⋮ Unnamed Item ⋮ Empirical Dynamic Programming ⋮ Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures ⋮ Fundamental design principles for reinforcement learning algorithms ⋮ Convergence Rates and Decoupling in Linear Stochastic Approximation Algorithms ⋮ Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning ⋮ Mean-Field Controls with Q-Learning for Cooperative MARL: Convergence and Complexity Analysis ⋮ Speedy Categorical Distributional Reinforcement Learning and Complexity Analysis ⋮ One-dimensional system arising in stochastic gradient descent ⋮ Concentration of Contractive Stochastic Approximation and Reinforcement Learning
