Finding optimal memoryless policies of POMDPs under the expected average reward criterion
Publication:418072
DOI: 10.1016/J.EJOR.2010.12.014
zbMATH Open: 1237.90250
OpenAlex: W2055418958
MaRDI QID: Q418072
FDO: Q418072
Hongsheng Xi, Baoqun Yin, Yanjie Li
Publication date: 14 May 2012
Published in: European Journal of Operational Research
Full work available at URL: https://doi.org/10.1016/j.ejor.2010.12.014
Recommendations
- Finding Optimal Observation-Based Policies for Constrained POMDPs Under the Expected Average Reward Criterion
- Finite-memory strategies in POMDPs with long-run average objectives
- Average optimality for continuous-time Markov decision processes with a policy iteration approach
- On Finding Optimal Policies for Markov Decision Chains: A Unifying Framework for Mean-Variance-Tradeoffs
- The policy iteration algorithm for average reward Markov decision processes with general state space
- Policies without Memory for the Infinite-Armed Bernoulli Bandit under the Average-Reward Criterion
- Policy iteration for bounded-parameter POMDPs
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
Cites Work
- A survey of algorithmic methods for partially observed Markov decision processes
- Title not available
- The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
- Title not available
- Perturbation realization, potentials, and sensitivity analysis of Markov processes
- Basic ideas for event-based optimization of Markov systems
- Stochastic learning and optimization. A sensitivity-based approach.
- Simulation-based optimization of Markov reward processes
- CONVERGENCE OF SIMULATION-BASED POLICY ITERATION
- Title not available
- Title not available
- Optimization of a special case of continuous-time Markov decision processes with compact action set
- The $n$th-Order Bias Optimality for Multichain Markov Decision Processes
- Event-Based Optimization of Markov Systems
- Potential-Based Online Policy Iteration Algorithms for Markov Decision Processes
- Performance optimization algorithms based on potentials for semi-Markov control processes
Cited In (4)