Practical issues in temporal difference learning

From MaRDI portal
Publication:1812929

DOI: 10.1007/BF00992697, zbMath: 0772.68075, Wikidata: Q56225518, Scholia: Q56225518, MaRDI QID: Q1812929

Gerald Tesauro

Publication date: 11 August 1992

Published in: Machine Learning




Related Items (37)

Many-Layered Learning
Approximate policy iteration: a survey and some new methods
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
Linear least-squares algorithms for temporal difference learning
Feature-based methods for large scale dynamic programming
Reinforcement learning with replacing eligibility traces
The loss from imperfect value functions in expectation-based and minimax-based tasks
Learning metric-topological maps for indoor mobile robot navigation
Model-based average reward reinforcement learning
Reinforcement learning of non-Markov decision processes
Deep learning of support vector machines with class probability output networks
Dimension reduction based adaptive dynamic programming for optimal control of discrete-time nonlinear control-affine systems
Reinforcement Learning, Bit by Bit
Optimal liquidation through a limit order book: a neural network and simulation approach
A brief survey on nonlinear control using adaptive dynamic programming under engineering-oriented complexities
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
Error controlled actor-critic
Learning long-term chess strategies from databases
Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
An Empirical Overview of the No Free Lunch Theorem and Its Effect on Real-World Machine Learning Classification
Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
SOLVING DYNAMIC WILDLIFE RESOURCE OPTIMIZATION PROBLEMS USING REINFORCEMENT LEARNING
Cooperation of categorical and behavioral learning in a practical solution to the abstraction problem
Approximate dynamic programming via direct search in the space of value function approximations
TD(λ) learning without eligibility traces: a theoretical analysis
Two-agent IDA*
A tutorial survey of reinforcement learning
Programming backgammon using self-teaching neural nets
Computer Go: An AI oriented survey
Two steps reinforcement learning
Bayesian Exploration for Approximate Dynamic Programming
A reinforcement learning approach for dynamic multi-objective optimization
PLAYER CO-MODELLING IN A STRATEGY BOARD GAME: DISCOVERING HOW TO PLAY FAST
Deep reinforcement learning for inventory control: a roadmap
A theoretical analysis of temporal difference learning in the iterated prisoner's dilemma game
An approximate dynamic programming method for multi-input multi-output nonlinear system
Games, computers, and artificial intelligence




