Practical issues in temporal difference learning

From MaRDI portal

Publication:1812929

DOI: 10.1007/BF00992697, zbMath: 0772.68075, Wikidata: Q56225518, Scholia: Q56225518, MaRDI QID: Q1812929

Gerald Tesauro

Publication date: 11 August 1992

Published in: Machine Learning




Related Items

Many-Layered Learning
Approximate policy iteration: a survey and some new methods
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
Linear least-squares algorithms for temporal difference learning
Feature-based methods for large scale dynamic programming
Reinforcement learning with replacing eligibility traces
The loss from imperfect value functions in expectation-based and minimax-based tasks
Learning metric-topological maps for indoor mobile robot navigation
Model-based average reward reinforcement learning
Reinforcement learning of non-Markov decision processes
Deep learning of support vector machines with class probability output networks
Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems
Reinforcement Learning, Bit by Bit
Optimal liquidation through a limit order book: a neural network and simulation approach
A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
Error controlled actor-critic
Learning long-term chess strategies from databases
Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification
Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING
Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem
Approximate dynamic programming via direct search in the space of value function approximations
TD(λ) learning without eligibility traces: a theoretical analysis
Two-agent IDA*
A tutorial survey of reinforcement learning
Programming backgammon using self-teaching neural nets
Computer Go: An AI oriented survey
Two steps reinforcement learning
Bayesian Exploration for Approximate Dynamic Programming
A reinforcement learning approach for dynamic multi-objective optimization
PLAYER CO-MODELLING IN A STRATEGY BOARD GAME: DISCOVERING HOW TO PLAY FAST
Deep reinforcement learning for inventory control: a roadmap
A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
An approximate dynamic programming method for multi-input multi-output nonlinear system
Games, computers, and artificial intelligence



Cites Work