On the Convergence of Stochastic Iterative Dynamic Programming Algorithms

From MaRDI portal
Publication:4323346

DOI: 10.1162/neco.1994.6.6.1185
zbMath: 0822.68095
OpenAlex: W2165131254
MaRDI QID: Q4323346

Michael I. Jordan, Tommi S. Jaakkola, Satinder Pal Singh

Publication date: 18 October 1995

Published in: Neural Computation

Full work available at URL: http://hdl.handle.net/1721.1/7205




Related Items (43)

Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms
Approximate policy iteration: a survey and some new methods
Cooperation between independent market makers
Perspectives of approximate dynamic programming
Restricted gradient-descent algorithm for value-function approximation in reinforcement learning
Linear least-squares algorithms for temporal difference learning
On the worst-case analysis of temporal-difference learning algorithms
Reinforcement learning with replacing eligibility traces
Q-learning and policy iteration algorithms for stochastic shortest path problems
A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization
Error bounds for constant step-size \(Q\)-learning
A Discrete-Time Switching System Analysis of Q-Learning
A novel policy based on action confidence limit to improve exploration efficiency in reinforcement learning
Unnamed Item
The optimal unbiased value estimator and its relation to LSTD, TD and MC
Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality
Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
A lexicographic optimization approach for a bi-objective parallel-machine scheduling problem minimizing total quality loss and total tardiness
Stochastic Fixed-Point Iterations for Nonexpansive Maps: Convergence and Error Bounds
Reinforcement learning algorithms with function approximation: recent advances and applications
Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference
A unified framework for stochastic optimization
SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING
REINFORCEMENT LEARNING WITH GOAL-DIRECTED ELIGIBILITY TRACES
Deep Reinforcement Learning: A State-of-the-Art Walkthrough
Full Gradient DQN Reinforcement Learning: A Provably Convergent Scheme
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
Adaptive Learning Algorithm Convergence in Passive and Reactive Environments
Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
The asymptotic equipartition property in reinforcement learning and its relation to return maximization
Adaptive stock trading with dynamic asset allocation using reinforcement learning
An optimal control approach to mode generation in hybrid systems
TD(λ) learning without eligibility traces: a theoretical analysis
Stochastic approximation algorithms: overview and recent trends.
Convergence of least squares learning in self-referential discontinuous stochastic models.
Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
Bayesian Exploration for Approximate Dynamic Programming
Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation
Reinforcement distribution in fuzzy Q-learning
Empirical Q-Value Iteration
A simulation-based approach to stochastic dynamic programming
Stochastic adaptation of importance sampler




