Practical issues in temporal difference learning

Publication:1812929

DOI: 10.1007/BF00992697 · zbMath: 0772.68075 · DBLP: journals/ml/Tesauro92 · Wikidata: Q56225518 · Scholia: Q56225518 · MaRDI QID: Q1812929

Gerald Tesauro

Publication date: 11 August 1992

Published in: Machine Learning

zbMATH Keywords

neural network; temporal difference learning; feature discovery; backgammon; connectionist methods

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)

Related Items (39)

Many-Layered Learning ⋮ Approximate policy iteration: a survey and some new methods ⋮ A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications ⋮ Linear least-squares algorithms for temporal difference learning ⋮ Feature-based methods for large scale dynamic programming ⋮ Reinforcement learning with replacing eligibility traces ⋮ The loss from imperfect value functions in expectation-based and minimax-based tasks ⋮ Learning metric-topological maps for indoor mobile robot navigation ⋮ Model-based average reward reinforcement learning ⋮ Reinforcement learning of non-Markov decision processes ⋮ Deep learning of support vector machines with class probability output networks ⋮ Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems ⋮ New Versions of Gradient Temporal-Difference Learning ⋮ Reinforcement Learning, Bit by Bit ⋮ Optimal liquidation through a limit order book: a neural network and simulation approach ⋮ A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities ⋮ Preference-based reinforcement learning: a formal framework and a policy iteration algorithm ⋮ Error controlled actor-critic ⋮ Learning long-term chess strategies from databases ⋮ Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results ⋮ An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification ⋮ Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty ⋮ SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING ⋮ Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem ⋮ Approximate dynamic programming via direct search in the space of value function approximations ⋮ Eligibility traces and forgetting factor in recursive least-squares-based temporal difference ⋮ TD(λ) learning without eligibility traces: a theoretical analysis ⋮ Two-agent IDA* ⋮ A tutorial survey of reinforcement learning ⋮ Programming backgammon using self-teaching neural nets ⋮ Computer Go: An AI oriented survey ⋮ Two steps reinforcement learning ⋮ Bayesian Exploration for Approximate Dynamic Programming ⋮ A reinforcement learning approach for dynamic multi-objective optimization ⋮ PLAYER CO-MODELLING IN A STRATEGY BOARD GAME: DISCOVERING HOW TO PLAY FAST ⋮ Deep reinforcement learning for inventory control: a roadmap ⋮ A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game ⋮ An approximate dynamic programming method for multi-input multi-output nonlinear system ⋮ Games, computers, and artificial intelligence
