A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (Q859737): Difference between revisions

From MaRDI portal
Property / full work available at URL: https://doi.org/10.1007/s10626-006-8134-8 / rank: Normal rank
Property / OpenAlex ID: W2062541405 / rank: Normal rank
Property / cites work: Functional Approximations and Dynamic Programming / rank: Normal rank
Property / cites work: Q3997575 / rank: Normal rank
Property / cites work: Q4209222 / rank: Normal rank
Property / cites work: Q4858374 / rank: Normal rank
Property / cites work: Technical update: Least-squares temporal difference learning / rank: Normal rank
Property / cites work: Q5477859 / rank: Normal rank
Property / cites work: The convergence of \(TD(\lambda)\) for general \(\lambda\) / rank: Normal rank
Property / cites work: On the existence of fixed points for approximate value iteration and temporal-difference learning / rank: Normal rank
Property / cites work: Q4368791 / rank: Normal rank
Property / cites work: 10.1162/1532443041827907 / rank: Normal rank
Property / cites work: On the convergence of temporal-difference learning with linear function approximation / rank: Normal rank
Property / cites work: An analysis of temporal-difference learning with function approximation / rank: Normal rank
Property / cites work: Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives / rank: Normal rank
Property / cites work: Extensions of the multiarmed bandit problem: The discounted case / rank: Normal rank
Property / cites work: Q5477861 / rank: Normal rank

Latest revision as of 12:40, 25 June 2024

Language: English
Label: A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
Description: scientific article

    Statements

    A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning (English)
    18 January 2007
    Dynamic programming
    Kalman filter
    Optimal stopping
    Queueing
    Recursive least-squares
    Reinforcement learning
    Temporal-difference learning