Pages that link to "Item:Q4323346"
From MaRDI portal
The following pages link to On the Convergence of Stochastic Iterative Dynamic Programming Algorithms (Q4323346):
- Perspectives of approximate dynamic programming (Q333093)
- Q-learning and policy iteration algorithms for stochastic shortest path problems (Q378731)
- The optimal unbiased value estimator and its relation to LSTD, TD and MC (Q415609)
- Reinforcement learning algorithms with function approximation: recent advances and applications (Q903601)
- Reinforcement distribution in fuzzy Q-learning (Q1037957)
- A unified framework for stochastic optimization (Q1719609)
- Linear least-squares algorithms for temporal difference learning (Q1911340)
- On the worst-case analysis of temporal-difference learning algorithms (Q1911342)
- Reinforcement learning with replacing eligibility traces (Q1911343)
- Error bounds for constant step-size \(Q\)-learning (Q1932736)
- Revisiting the ODE method for recursive algorithms: fast convergence using quasi stochastic approximation (Q2070010)
- Restricted gradient-descent algorithm for value-function approximation in reinforcement learning (Q2389624)
- The asymptotic equipartition property in reinforcement learning and its relation to return maximization (Q2488678)
- Adaptive stock trading with dynamic asset allocation using reinforcement learning (Q2499055)
- An optimal control approach to mode generation in hybrid systems (Q2499632)
- A simulation-based approach to stochastic dynamic programming (Q2863720)
- Approximate policy iteration: a survey and some new methods (Q2887629)
- Stochastic adaptation of importance sampler (Q3143505)
- TD(λ) learning without eligibility traces: a theoretical analysis (Q4421245)
- Bayesian Exploration for Approximate Dynamic Programming (Q4971589)
- (Q4998920)
- Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms (Q5037552)
- Cooperation between independent market makers (Q5051973)
- A Q-Learning Algorithm for Discrete-Time Linear-Quadratic Control with Random Parameters of Unknown Distribution: Convergence and Stabilization (Q5093265)
- Technical Note—Consistency Analysis of Sequential Learning Under Approximate Bayesian Inference (Q5130497)
- Deep Reinforcement Learning: A State-of-the-Art Walkthrough (Q5145831)
- Full Gradient DQN Reinforcement Learning: A Provably Convergent Scheme (Q5153609)
- Adaptive Learning Algorithm Convergence in Passive and Reactive Environments (Q5157257)
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis (Q5162625)
- Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures (Q5219554)
- SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING (Q5697240)
- REINFORCEMENT LEARNING WITH GOAL-DIRECTED ELIGIBILITY TRACES (Q5699354)
- Empirical Q-Value Iteration (Q5856670)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5898263)
- Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes (Q5920615)
- Stochastic approximation algorithms: overview and recent trends. (Q5955825)
- Convergence of least squares learning in self-referential discontinuous stochastic models. (Q5956277)
- A Discrete-Time Switching System Analysis of Q-Learning (Q6107867)
- A novel policy based on action confidence limit to improve exploration efficiency in reinforcement learning (Q6121659)
- Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality (Q6136230)
- Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning (Q6148353)
- A lexicographic optimization approach for a bi-objective parallel-machine scheduling problem minimizing total quality loss and total tardiness (Q6164626)
- Stochastic Fixed-Point Iterations for Nonexpansive Maps: Convergence and Error Bounds (Q6180255)