Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning (Q5219302): Difference between revisions

Revision as of 01:28, 22 July 2024

scientific article; zbMATH DE number 7179328

Language	Label	Description	Also known as
English	Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning	scientific article; zbMATH DE number 7179328

Statements

instance of

scholarly article

0 references

title

Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning (English)

0 references

Mathematics of Operations Research

0 references

publication date

11 March 2020

0 references

full work available at URL

https://arxiv.org/abs/1503.09105

0 references

zbMATH Keywords

Markov noise

0 references

two time-scale stochastic approximation

0 references

asymptotic convergence

0 references

temporal-difference learning

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Q3324260

0 references

Q4938927

0 references

Stochastic Approximations and Differential Inclusions

0 references

Q3997575

0 references

Q4858374

0 references

Stochastic approximation with two time scales

0 references

Stochastic approximation with 'controlled Markov' noise

0 references

Linear stochastic approximation driven by slowly varying Markov chains

0 references

On Actor-Critic Algorithms

0 references

Stochastic approximations for finite-state Markov chains

0 references

Basis function adaptation in temporal difference reinforcement learning

0 references

Applications of a Kushner and Clark lemma to general classes of stochastic algorithms

0 references

Q5526189

0 references

Convergence and convergence rate of stochastic gradient search in the case of multiple and non-isolated extrema

0 references

Least Squares Temporal Difference Methods: An Analysis under General Conditions

0 references

Q2953645

0 references

Identifiers

zbMATH Open document ID

1434.62174

0 references

DOI

10.1287/moor.2017.0855

0 references

Mathematics Subject Classification ID

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:5219302

@@ Property / cites work @@
+Q3324260
@@ Property / cites work: Q3324260 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4938927
@@ Property / cites work: Q4938927 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic Approximations and Differential Inclusions
+Normal rank
@@ Property / cites work @@
+Q3997575
@@ Property / cites work: Q3997575 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4858374
@@ Property / cites work: Q4858374 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with 'controlled Markov' noise
+Normal rank
@@ Property / cites work @@
+Linear stochastic approximation driven by slowly varying Markov chains
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximations for finite-state Markov chains
+Normal rank
@@ Property / cites work @@
+Basis function adaptation in temporal difference reinforcement learning
+Normal rank
@@ Property / cites work @@
+Applications of a Kushner and Clark lemma to general classes of stochastic algorithms
+Normal rank
@@ Property / cites work @@
+Q5526189
@@ Property / cites work: Q5526189 / rank @@
+Normal rank
@@ Property / cites work @@
+Convergence and convergence rate of stochastic gradient search in the case of multiple and non-isolated extrema
+Normal rank
@@ Property / cites work @@
+Least Squares Temporal Difference Methods: An Analysis under General Conditions
+Normal rank
@@ Property / cites work @@
+Q2953645
@@ Property / cites work: Q2953645 / rank @@
+Normal rank