Asynchronous stochastic approximation and Q-learning

From MaRDI portal
Publication:1345139

zbMath: 0820.68105; MaRDI QID: Q1345139

John N. Tsitsiklis

Publication date: 26 February 1995

Published in: Machine Learning




Related Items

Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
An information-theoretic analysis of return maximization in reinforcement learning
Multiscale Q-learning with linear function approximation
Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design
Asynchronous stochastic approximation with differential inclusions
Optimal Hour-Ahead Bidding in the Real-Time Electricity Market with Battery Storage Using Approximate Dynamic Programming
Online calibrated forecasts: memory efficiency versus universality for learning in games
Perspectives of approximate dynamic programming
Actor-critic algorithms for hierarchical Markov decision processes
Linear least-squares algorithms for temporal difference learning
Feature-based methods for large scale dynamic programming
Reinforcement learning with replacing eligibility traces
The loss from imperfect value functions in expectation-based and minimax-based tasks
Asymptotics of Reinforcement Learning with Neural Networks
An adaptive learning model with foregone payoff information
Q-learning and policy iteration algorithms for stochastic shortest path problems
A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization
Fictitious Play in Zero-Sum Stochastic Games
Neural circuits for learning context-dependent associations of stimuli
Stochastic approximation with two time scales
Error bounds for constant step-size \(Q\)-learning
Approximate stochastic annealing for online control of infinite horizon Markov decision processes
A Discrete-Time Switching System Analysis of Q-Learning
Unnamed Item
On the sample complexity of actor-critic method for reinforcement learning with function approximation
Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality
Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
A stochastic contraction mapping theorem
Optimal liquidation through a limit order book: a neural network and simulation approach
Stochastic Fixed-Point Iterations for Nonexpansive Maps: Convergence and Error Bounds
Reinforcement learning algorithms with function approximation: recent advances and applications
Underestimation estimators to Q-learning
Independent learning in stochastic games
Iterative learning control for large scale nonlinear systems with observation noise
New algorithms of the Q-learning type
Stabilization of stochastic approximation by step size adaptation
Generalization of a result of Fabian on the asymptotic normality of stochastic approximation
Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference
\(Q\)-Learning in a Stochastic Stackelberg Game between an Uninformed Leader and a Naive Follower
A unified framework for stochastic optimization
Reinforcement learning for long-run average cost.
Adaptive dynamic programming and optimal control of nonlinear nonaffine systems
On Generalized Bellman Equations and Temporal-Difference Learning
Q-learning algorithms with random truncation bounds and applications to effective parallel computing
Full Gradient DQN Reinforcement Learning: A Provably Convergent Scheme
Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach
The asymptotic equipartition property in reinforcement learning and its relation to return maximization
Structural estimation of real options models
An optimal control approach to mode generation in hybrid systems
Boundedness of iterates in \(Q\)-learning
The actor-critic algorithm as multi-time-scale stochastic approximation.
Stochastic approximation algorithms: overview and recent trends.
An Approximate Dynamic Programming Algorithm for Monotone Value Functions
A parallel scheduling algorithm for reinforcement learning in large state space
Continuous-Time Robust Dynamic Programming
Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
Bayesian Exploration for Approximate Dynamic Programming
Convergence results on stochastic adaptive learning
Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation
Reinforcement learning and stochastic optimisation
An application of approximate dynamic programming in multi-period multi-product advertising budgeting
On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes
Natural actor-critic algorithms
Unnamed Item
Unnamed Item
Fundamental design principles for reinforcement learning algorithms
Empirical Q-Value Iteration
Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
A simulation-based approach to stochastic dynamic programming
A Gentle Introduction to Reinforcement Learning