Practical issues in temporal difference learning
Publication:1812929
DOI: 10.1007/BF00992697, zbMath: 0772.68075, Wikidata: Q56225518, Scholia: Q56225518, MaRDI QID: Q1812929
Publication date: 11 August 1992
Published in: Machine Learning
Related Items
- Many-Layered Learning
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Linear least-squares algorithms for temporal difference learning
- Feature-based methods for large scale dynamic programming
- Reinforcement learning with replacing eligibility traces
- The loss from imperfect value functions in expectation-based and minimax-based tasks
- Learning metric-topological maps for indoor mobile robot navigation
- Model-based average reward reinforcement learning
- Reinforcement learning of non-Markov decision processes
- Deep learning of support vector machines with class probability output networks
- Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems
- Reinforcement Learning, Bit by Bit
- Optimal liquidation through a limit order book: a neural network and simulation approach
- A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities
- Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
- Error controlled actor-critic
- Learning long-term chess strategies from databases
- Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
- An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification
- Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
- SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING
- Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem
- Approximate dynamic programming via direct search in the space of value function approximations
- TD(λ) learning without eligibility traces: a theoretical analysis
- Two-agent IDA*
- A tutorial survey of reinforcement learning
- Programming backgammon using self-teaching neural nets
- Computer Go: An AI oriented survey
- Two steps reinforcement learning
- Bayesian Exploration for Approximate Dynamic Programming
- A reinforcement learning approach for dynamic multi-objective optimization
- PLAYER CO-MODELLING IN A STRATEGY BOARD GAME: DISCOVERING HOW TO PLAY FAST
- Deep reinforcement learning for inventory control: a roadmap
- A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
- An approximate dynamic programming method for multi-input multi-output nonlinear system
- Games, computers, and artificial intelligence
Cites Work
- A pattern classification approach to evaluation function learning
- A parallel network that learns to play backgammon
- Multilayer feedforward networks are universal approximators
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- A comparison and evaluation of three machine learning procedures as applied to the game of checkers
- Learnability and the Vapnik-Chervonenkis dimension
- On Optimal Doubling in Backgammon
- Learning representations by back-propagating errors
- On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
- A Stochastic Approximation Method