Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (Q1886590): Difference between revisions

From MaRDI portal
Created claim: Wikidata QID (P12): Q40489238, #quickstatements; #temporary_batch_1707216511891
Property / MaRDI profile type: MaRDI publication profile / rank: Normal rank
Property / full work available at URL: https://doi.org/10.1016/j.neunet.2004.05.004 / rank: Normal rank
Property / OpenAlex ID: W2009424996 / rank: Normal rank
Property / cites work: A near-optimal polynomial time algorithm for learning in certain classes of stochastic games / rank: Normal rank
Property / cites work: Dual-control theory. I / rank: Normal rank
Property / cites work: Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function / rank: Normal rank
Property / cites work: \({\mathcal Q}\)-learning / rank: Normal rank
Property / cites work: Mean, variance and probabilistic criteria in finite Markov decision processes: A review / rank: Normal rank
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning / rank: Normal rank
Property / cites work: The apparent conflict between estimation and control - a survey of the two-armed bandit problem / rank: Normal rank

Latest revision as of 15:24, 7 June 2024

Language: English
Label: Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function
Description: scientific article

    Statements

    Reliability of internal prediction/estimation and its application. I: Adaptive action selection reflecting reliability of value function (English)
    18 November 2004
    Internal prediction
    Reliability
    Model-free reinforcement learning
    TD learning
    Discount rate
    Exploration-exploitation balance
    Temperature parameter
    Meta-learning