Dynamic programming, Markov chains, and the method of successive approximations

DOI: 10.1016/0022-247X(63)90017-9
zbMath: 0124.36404
OpenAlex: W2081374871
MaRDI QID: Q2393803

Douglas J. White

Publication date: 1963

Published in: Journal of Mathematical Analysis and Applications

Full work available at URL: https://doi.org/10.1016/0022-247x(63)90017-9
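
The publication concerns computing optimal policies for undiscounted (average-reward) Markov decision processes by the method of successive approximations, the scheme now commonly called relative value iteration. The sketch below is a minimal illustration of that scheme on a hypothetical two-state, two-action MDP; the transition matrices, rewards, reference state, and tolerance are illustrative assumptions, not data from the paper.

```python
# Minimal sketch of successive approximations (relative value iteration)
# for an average-reward MDP. All numbers here are hypothetical examples.
import numpy as np

# Hypothetical MDP: P[a][s, s'] transition probabilities, r[a][s] rewards.
P = [np.array([[0.9, 0.1], [0.4, 0.6]]),
     np.array([[0.2, 0.8], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]),
     np.array([0.5, 2.0])]

v = np.zeros(2)                      # relative value estimates
for _ in range(1000):
    # One Bellman backup: (T v)(s) = max_a [ r_a(s) + sum_s' P_a(s, s') v(s') ].
    q = np.array([r[a] + P[a] @ v for a in range(len(P))])
    v_new = q.max(axis=0)
    # Subtract the value at a reference state (state 0) so the iterates stay
    # bounded; the subtracted quantity converges to the optimal gain g under
    # the usual unichain/aperiodicity conditions.
    g = v_new[0]
    v_next = v_new - g
    if np.max(np.abs(v_next - v)) < 1e-10:
        v = v_next
        break
    v = v_next

policy = np.argmax(np.array([r[a] + P[a] @ v for a in range(len(P))]), axis=0)
print("estimated average reward g =", g)
print("relative values v =", v, "greedy policy =", policy)
```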




Related Items (44)

Computation techniques for large scale undiscounted Markov decision processes
Numerical methods for controlled and uncontrolled multiplexing and queueing systems
Dual bounds on the equilibrium distribution of a finite Markov chain
A methodology for computation reduction for specially structured large scale Markov decision problems
How fast do equilibrium payoff sets converge in repeated games?
On the Control of a Queueing System with Aging State Information
Contraction mappings underlying undiscounted Markov decision problems. II
Approximation of average cost optimal policies for general Markov decision processes with unbounded costs
Value iteration in average cost Markov control processes on Borel spaces
Relative Value Iteration for Stochastic Differential Games
Unnamed Item
Some basic concepts of numerical treatment of Markov decision models
On the solvability of Bellman's functional equations for Markov renewal programming
Quality assurance and stage dynamics in multi-stage manufacturing. Part II
Connectedness conditions used in finite state Markov decision processes
A note on the convergence rate of the value iteration scheme in controlled Markov chains
Nonstationary Markov decision problems with converging parameters
The blast furnaces problem
On the global convergence of relative value iteration for infinite-horizon risk-sensitive control of diffusions
\(R(\lambda)\) imitation learning for automatic generation control of interconnected power grids
The method of value oriented successive approximations for the average reward Markov decision process
An iterative method for approximating average cost optimal (s,S) inventory policies
Open Problem—Convergence and Asymptotic Optimality of the Relative Value Iteration in Ergodic Control
A value iteration method for undiscounted multichain Markov decision processes
Water reservoir control under economic, social and environmental constraints
Unnamed Item
Unnamed Item
A structured pattern matrix algorithm for multichain Markov decision processes
Markov decision processes
Iterative algorithms for solving undiscounted Bellman equations
Spectral inequalities for nonnegative tensors and their tropical analogues
Exponential convergence of products of stochastic matrices
Contraction mappings underlying undiscounted Markov decision problems
Improved iterative computation of the expected discounted return in Markov and semi-Markov chains
Receding horizon control for water resources management
Optimal stochastic control
Iterative solution of the functional equations of undiscounted Markov renewal programming
On the Optimality of Trunk Reservation in Overflow Processes
Unnamed Item
MARKOV DECISION PROCESSES
Value iteration in countable state average cost Markov decision processes with unbounded costs
Finite state approximation algorithms for average cost denumerable state Markov decision processes
Generalized polynomial approximations in Markovian decision processes
Optimal pricing for a \(\mathrm{GI}/\mathrm{M}/k/N\) queue with several customer types and holding costs


