An emphatic approach to the problem of off-policy temporal-difference learning (Q2810885)

scientific article; zbMATH DE number 6589487

Statements

instance of

publication date

6 June 2016

full work available at URL

https://arxiv.org/abs/1503.04269

http://jmlr.csail.mit.edu/papers/v17/14-488.html

zbMATH Keywords

temporal-difference learning

off-policy learning

function approximation

stability

convergence

MaRDI profile type

MaRDI publication profile

title

An emphatic approach to the problem of off-policy temporal-difference learning (English)

published in

Journal of Machine Learning Research (JMLR)

Identifiers

zbMATH Open document ID

1360.68712

Mathematics Subject Classification ID

68T05

zbMATH DE Number

6589487

Sitelinks

Mathematics (1 entry)

mardi Publication:2810885