The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning

From MaRDI portal

Publication:4943730

DOI: 10.1137/S0363012997331639
zbMath: 0990.62071
MaRDI QID: Q4943730

Vivek S. Borkar, Sean P. Meyn

Publication date: 19 March 2000

Published in: SIAM Journal on Control and Optimization
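For context, a minimal sketch of the kind of recursion this paper analyzes: the stochastic approximation iteration x_{n+1} = x_n + a_n (h(x_n) + M_{n+1}), whose asymptotic behavior the ODE method ties to the limiting ODE dx/dt = h(x). The choice h(x) = -x, the step sizes, and the Gaussian noise below are illustrative assumptions, not taken from this record.

```python
import random

def stochastic_approximation(x0=5.0, n_steps=20000, seed=0):
    # Illustrative example (not from the MaRDI record): iterate
    # x_{n+1} = x_n + a_n * (h(x_n) + M_{n+1}) with h(x) = -x,
    # so the limiting ODE dx/dt = -x has the globally
    # asymptotically stable equilibrium x* = 0.
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        a_n = 1.0 / n                # steps with sum a_n = inf, sum a_n^2 < inf
        noise = rng.gauss(0.0, 1.0)  # martingale-difference noise M_{n+1}
        x = x + a_n * (-x + noise)   # drift h(x) = -x tracks the ODE
    return x

print(abs(stochastic_approximation()))  # close to 0, the ODE equilibrium
```

Under these conditions the iterates track the ODE flow, so x_n converges to the equilibrium 0 despite the noise.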




Related Items (74)

Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
Accelerated and Instance-Optimal Policy Evaluation with Linear Function Approximation
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method
A new learning algorithm for optimal stopping
An information-theoretic analysis of return maximization in reinforcement learning
Multiscale Q-learning with linear function approximation
A sojourn-based approach to semi-Markov reinforcement learning
Online calibrated forecasts: memory efficiency versus universality for learning in games
Learning to control a structured-prediction decoder for detection of HTTP-layer DDoS attackers
Reinforcement learning based algorithms for average cost Markov decision processes
An adaptive optimization scheme with satisfactory transient performance
Distributed Stochastic Approximation with Local Projections
Oja's algorithm for graph clustering, Markov spectral decomposition, and risk sensitive control
A Diffusion Approximation Theory of Momentum Stochastic Gradient Descent in Nonconvex Optimization
Stochastic recursive inclusions with non-additive iterate-dependent Markov noise
A stability criterion for two timescale stochastic approximation schemes
Stochastic approximation with long range dependent and heavy tailed noise
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
Error bounds for constant step-size \(Q\)-learning
A Small Gain Analysis of Single Timescale Actor Critic
Risk-Sensitive Reinforcement Learning via Policy Gradient Search
Variance-constrained actor-critic algorithms for discounted and average reward MDPs
Convergence of stochastic approximation via martingale and converse Lyapunov methods
A Discrete-Time Switching System Analysis of Q-Learning
Unnamed Item
On the sample complexity of actor-critic method for reinforcement learning with function approximation
Gradient temporal-difference learning for off-policy evaluation using emphatic weightings
Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
Multi-agent natural actor-critic reinforcement learning algorithms
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensor placement
Asymptotic bias of stochastic gradient search
An online actor-critic algorithm with function approximation for constrained Markov decision processes
Approachability in Stackelberg stochastic games with vector costs
On stochastic gradient and subgradient methods with adaptive steplength sequences
Stabilization of stochastic approximation by step size adaptation
Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference
Reinforcement learning for long-run average cost.
Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals
Q-learning for Markov decision processes with a satisfiability criterion
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
Popularity signals in trial-offer markets with social influence and position bias
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization
Simultaneous perturbation Newton algorithms for simulation optimization
Charge-based control of DiffServ-like queues
The Borkar-Meyn theorem for asynchronous stochastic approximations
Convergence and convergence rate of stochastic gradient search in the case of multiple and non-isolated extrema
Avoidance of traps in stochastic approximation
Linear stochastic approximation driven by slowly varying Markov chains
Boundedness of iterates in \(Q\)-learning
A sensitivity formula for risk-sensitive cost and the actor-critic algorithm
Cooperative dynamics and Wardrop equilibria
Empirical Dynamic Programming
On the convergence of stochastic approximations under a subgeometric ergodic Markov dynamic
Multi-armed bandits based on a variant of simulated annealing
Event-driven stochastic approximation
A stochastic primal-dual method for optimization with conditional value at risk constraints
Nonlinear Gossip
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise
Model-Free Reinforcement Learning for Stochastic Parity Games
Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation
An ODE method to prove the geometric convergence of adaptive stochastic algorithms
Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
Natural actor-critic algorithms
A Finite Time Analysis of Temporal Difference Learning with Linear Function Approximation
Non-asymptotic error bounds for constant stepsize stochastic approximation for tracking mobile agents
What may lie ahead in reinforcement learning
Fundamental design principles for reinforcement learning algorithms
Stability of annealing schemes and related processes
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
Convergence of Recursive Stochastic Algorithms Using Wasserstein Divergence
Analyzing Approximate Value Iteration Algorithms
Iterative learning control using faded measurements without system information: a gradient estimation approach






This page was built for publication: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning