The optimal reward operator in dynamic programming

Publication:1222389

DOI: 10.1214/aop/1176996558
zbMath: 0318.49021
OpenAlex: W1975828588
MaRDI QID: Q1222389

David Freedman, Michael Orkin, David Blackwell

Publication date: 1974

Published in: The Annals of Probability

Full work available at URL: https://doi.org/10.1214/aop/1176996558




Related Items (27)

On the complexity of linear quadratic control
EQUILIBRIUM STRATEGIES IN STOCHASTIC GAMES WITH ADDITIVE COST AND TRANSITION STRUCTURE
Leavable Gambling Problems with Unbounded Utilities
Non-randomized strategies in stochastic decision processes
A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies
Multi-factor dynamic investment under uncertainty
Stochastic games with metric state space
The transformation method for continuous-time Markov decision processes
Conditions for characterizing the structure of optimal strategies in infinite-horizon dynamic programs
Big vee: the story of a function, an algorithm, and three mathematical worlds
Measurable, nonleavable gambling problems
Average Cost Optimality Inequality for Markov Decision Processes with Borel Spaces and Universally Measurable Policies
How to stay in a set or Koenig's lemma for random paths
Countably additive gambling and optimal stopping
On measurable minimax selectors
On a theorem of Wald and Wolfowitz on randomization in statistics
Estimates for finite-stage dynamic programs
On the optimality of (s, S)-strategies in a minimax inventory model with average cost criterion
Some results on analytic spaces and semi-analytic functions with regard to gambling theory
Finitely Additive Dynamic Programming
Measurable selection theorems for optimization problems
Bounded variation of \(\{V_n\}\) and its limit
Semicontinuous nonstationary stochastic games. II
On structural properties of optimal average cost functions in Markov decision processes with Borel spaces and universally measurable policies
On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes
Risk, uncertainty, and complexity
Stationary policies and Markov policies in Borel dynamic programming



