On Generalized Bellman Equations and Temporal-Difference Learning (Q3305109): Difference between revisions
From MaRDI portal
Revision by EloiFerrer: Changed label, description and/or aliases in en, and other parts
Revision by EloiFerrer: Merged Item from Q4558197
Property | Value | Rank |
---|---|---|
description / en | scientific article; zbMATH DE number 6982339 | |
zbMATH Open document ID | 1465.90117 | Normal rank |
publication date | 21 November 2018 | Normal rank |
full work available at URL | http://jmlr.csail.mit.edu/papers/v19/17-283.html | Normal rank |
Mathematics Subject Classification ID | 60J20 | Normal rank |
Mathematics Subject Classification ID | 90C39 | Normal rank |
zbMATH DE Number | 6982339 | Normal rank |
zbMATH Keywords | approximate policy evaluation | Normal rank |
zbMATH Keywords | reinforcement learning | Normal rank |
zbMATH Keywords | temporal-difference method | Normal rank |
describes a project that uses | SBEED | Normal rank |
arXiv classification | cs.LG | Normal rank |
arXiv classification | math.OC | Normal rank |
Latest revision as of 10:14, 6 May 2024
scientific article; zbMATH DE number 6982339
Language | Label | Description | Also known as |
---|---|---|---|
English | On Generalized Bellman Equations and Temporal-Difference Learning | scientific article; zbMATH DE number 6982339 | |
Statements
On Generalized Bellman Equations and Temporal-Difference Learning (English)
5 August 2020
21 November 2018
Markov decision process
policy evaluation
generalized Bellman equation
temporal differences
Markov chain
randomized stopping time
approximate policy evaluation
reinforcement learning
temporal-difference method
cs.LG
math.OC