Practical issues in temporal difference learning
Publication: 1812929
DOI: 10.1007/BF00992697
zbMath: 0772.68075
Wikidata: Q56225518
Scholia: Q56225518
MaRDI QID: Q1812929
Publication date: 11 August 1992
Published in: Machine Learning
Related Items (37)
- Many-Layered Learning
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Linear least-squares algorithms for temporal difference learning
- Feature-based methods for large scale dynamic programming
- Reinforcement learning with replacing eligibility traces
- The loss from imperfect value functions in expectation-based and minimax-based tasks
- Learning metric-topological maps for indoor mobile robot navigation
- Model-based average reward reinforcement learning
- Reinforcement learning of non-Markov decision processes
- Deep learning of support vector machines with class probability output networks
- Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems
- Reinforcement Learning, Bit by Bit
- Optimal liquidation through a limit order book: a neural network and simulation approach
- A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
- Error controlled actor-critic
- Learning long-term chess strategies from databases
- Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
- An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification
- Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
- SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING
- Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem
- Approximate dynamic programming via direct search in the space of value function approximations
- TD(λ) learning without eligibility traces: a theoretical analysis
- Two-agent IDA*
- A tutorial survey of reinforcement learning
- Programming backgammon using self-teaching neural nets
- Computer Go: An AI oriented survey
- Two steps reinforcement learning
- Bayesian Exploration for Approximate Dynamic Programming
- A reinforcement learning approach for dynamic multi-objective optimization
- PLAYER CO-MODELLING IN A STRATEGY BOARD GAME: DISCOVERING HOW TO PLAY FAST
- Deep reinforcement learning for inventory control: a roadmap
- A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
- An approximate dynamic programming method for multi-input multi-output nonlinear system
- Games, computers, and artificial intelligence
Cites Work
- A pattern classification approach to evaluation function learning
- A parallel network that learns to play backgammon
- Multilayer feedforward networks are universal approximators
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- A comparison and evaluation of three machine learning procedures as applied to the game of checkers
- Learnability and the Vapnik-Chervonenkis dimension
- On Optimal Doubling in Backgammon
- Learning representations by back-propagating errors
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- A Stochastic Approximation Method