Pages that link to "Item:Q5273720"
From MaRDI portal
The following pages link to Potential-Based Online Policy Iteration Algorithms for Markov Decision Processes (Q5273720):
- Finding optimal memoryless policies of POMDPs under the expected average reward criterion (Q418072)
- Temporal difference-based policy iteration for optimal control of stochastic systems (Q467477)
- Basic ideas for event-based optimization of Markov systems (Q1773104)
- Early developments of control theory in China (Q2512086)
- A performance gradient perspective on gradient‐based policy iteration and a modified value iteration (Q3613729)
- Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration (Q3654586)