Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (Q1009248): Difference between revisions

From MaRDI portal
All properties below have Normal rank.

Property / MaRDI profile type: MaRDI publication profile
Property / full work available at URL: https://doi.org/10.1007/s10994-007-5038-2
Property / OpenAlex ID: W2104753538
Property / cites work: Neural Network Learning
Property / cites work: Learning Near-Optimal Policies with Bellman-Residual Minimization Based Fitted Policy Iteration and a Single Sample Path
Property / cites work: Adaptive estimation in autoregression or \(\beta\)-mixing regression via model selection
Property / cites work: Functional Approximations and Dynamic Programming
Property / cites work: Stochastic optimal control. The discrete time case
Property / cites work: Q4257216
Property / cites work: Q5477859
Property / cites work: MIXING AND MOMENT PROPERTIES OF VARIOUS GARCH AND STOCHASTIC VOLATILITY MODELS
Property / cites work: Q5543516
Property / cites work: Mixing Conditions for Markov Chains
Property / cites work: Q4881152
Property / cites work: Mixing: Properties and examples
Property / cites work: Q3093261
Property / cites work: Q4434179
Property / cites work: A distribution-free theory of nonparametric regression
Property / cites work: Sphere packing numbers for subsets of the Boolean \(n\)-cube with bounded Vapnik-Chervonenkis dimension
Property / cites work: Q3266141
Property / cites work: Q3218572
Property / cites work: 10.1162/1532443041827907
Property / cites work: Nonparametric time series prediction through adaptive model selection
Property / cites work: Markov chains and stochastic stability
Property / cites work: Q3093292
Property / cites work: Histogram regression estimation using data-dependent partitions
Property / cites work: Kernel-based reinforcement learning
Property / cites work: Convergence of stochastic processes
Property / cites work: Q4001821
Property / cites work: Generalized polynomial approximations in Markovian decision processes
Property / cites work: Q5477860
Property / cites work: Rates of convergence for empirical processes of stationary mixing sequences

Latest revision as of 10:06, 1 July 2024

Language: English
Label: Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Description: scientific article

    Statements

    Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path (English)
    31 March 2009
    reinforcement learning
    policy iteration
    Bellman-residual minimization
    least-squares temporal difference learning
    off-policy learning
    nonparametric regression
    least-squares regression
    finite-sample bounds

    Identifiers