Q4558197 (Q4558197): Difference between revisions

From MaRDI portal
Item:Q4558197
Importer (talk | contribs)
Changed label, description and/or aliases in en, and other parts
Merged Item into Q3305109
Tag: Replaced
label / en
On Generalized Bellman Equations and Temporal-Difference Learning
description / en
scientific article; zbMATH DE number 6982339
Property / instance of
 
Property / instance of: scholarly article / rank
Normal rank
 
Property / zbMATH Open document ID
 
Property / zbMATH Open document ID: 1465.90117 / rank
Normal rank
 
Property / author
 
Property / author: Huizhen Yu / rank
Normal rank
 
Property / author
 
Property / author: Ashique Rupam Mahmood / rank
Normal rank
 
Property / author
 
Property / author: Richard S. Sutton / rank
Normal rank
 
Property / publication date
21 November 2018
Timestamp: +2018-11-21T00:00:00Z
Timezone: +00:00
Calendar: Gregorian
Precision: 1 day
Before: 0
After: 0
 
Property / publication date: 21 November 2018 / rank
Normal rank
 
Property / full work available at URL
 
Property / full work available at URL: https://arxiv.org/abs/1704.04463 / rank
Normal rank
 
Property / full work available at URL
 
Property / full work available at URL: http://jmlr.csail.mit.edu/papers/v19/17-283.html / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C40 / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 60J20 / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 68T05 / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C39 / rank
Normal rank
 
Property / zbMATH DE Number
 
Property / zbMATH DE Number: 6982339 / rank
Normal rank
 
Property / zbMATH Keywords
Markov decision process
 
Property / zbMATH Keywords: Markov decision process / rank
Normal rank
 
Property / zbMATH Keywords
approximate policy evaluation
 
Property / zbMATH Keywords: approximate policy evaluation / rank
Normal rank
 
Property / zbMATH Keywords
generalized Bellman equation
 
Property / zbMATH Keywords: generalized Bellman equation / rank
Normal rank
 
Property / zbMATH Keywords
reinforcement learning
 
Property / zbMATH Keywords: reinforcement learning / rank
Normal rank
 
Property / zbMATH Keywords
temporal-difference method
 
Property / zbMATH Keywords: temporal-difference method / rank
Normal rank
 
Property / zbMATH Keywords
Markov chain
 
Property / zbMATH Keywords: Markov chain / rank
Normal rank
 
Property / zbMATH Keywords
randomized stopping time
 
Property / zbMATH Keywords: randomized stopping time / rank
Normal rank
 
Property / describes a project that uses
 
Property / describes a project that uses: SBEED / rank
Normal rank
 
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
Normal rank
 
Property / arXiv ID
 
Property / arXiv ID: 1704.04463 / rank
Normal rank
 
Property / arXiv classification
cs.LG
 
Property / arXiv classification: cs.LG / rank
Normal rank
 
Property / arXiv classification
math.OC
 
Property / arXiv classification: math.OC / rank
Normal rank
 
links / mardi / name

Revision as of 10:13, 6 May 2024

Language Label Description Also known as
English
No label defined
No description defined

    Statements