Algorithms for Reinforcement Learning

From MaRDI portal
Publication:3588852

DOI: 10.2200/S00268ED1V01Y201005AIM009
zbMath: 1205.68320
OpenAlex: W4211221179
MaRDI QID: Q3588852

Csaba Szepesvári

Publication date: 10 September 2010

Published in: Synthesis Lectures on Artificial Intelligence and Machine Learning

Full work available at URL: https://doi.org/10.2200/s00268ed1v01y201005aim009




Related Items (41)

Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
A convex optimization approach to dynamic programming in continuous state and action spaces
Adaptive playouts for online learning of policies during Monte Carlo tree search
A unified DC programming framework and efficient DCA based approaches for large scale batch reinforcement learning
Continuous-action planning for discounted infinite-horizon nonlinear optimal control with Lipschitz values
Closed-form Approximations in Multi-asset Market Making
Online spatio-temporal matching in stochastic and dynamic domains
Unnamed Item
Efficient model-based reinforcement learning for approximate online optimal control
Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning
Hypervolume indicator and dominance reward based multi-objective Monte-Carlo tree search
A systematic study on meta-heuristic approaches for solving the graph coloring problem
On learning and branching: a survey
Robust adaptive dynamic programming for linear and nonlinear systems: an overview
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
Optimal activation of halting multi‐armed bandit models
Formalization of methods for the development of autonomous artificial intelligence systems
Dynamic treatment regimes: technical challenges and applications
Deep reinforcement trading with predictable returns
Model selection in reinforcement learning
Approximate Q Learning for Controlled Diffusion Processes and Its Near Optimality
Reinforcement learning algorithms with function approximation: recent advances and applications
Crowd computing as a cooperation problem: An evolutionary approach
Asymptotic analysis of value prediction by well-specified and misspecified models
Abstraction from demonstration for efficient reinforcement learning in high-dimensional domains
A unified framework for stochastic optimization
Markov decision processes with sequential sensor measurements
A Reinforcement Learning Neural Network for Robotic Manipulator Control
Unnamed Item
Unnamed Item
Proximal algorithms and temporal difference methods for solving fixed point problems
Some recent advances in learning and adaptation for uncertain feedback control systems
Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm
Bayesian Exploration for Approximate Dynamic Programming
Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes
Fundamental design principles for reinforcement learning algorithms
Empirical Q-Value Iteration
Unnamed Item
Unnamed Item




