Potential-based least-squares policy iteration for a parameterized feedback control system
DOI: 10.1007/s10957-015-0809-6
zbMath: 1342.49047
OpenAlex: W2282691152
MaRDI QID: Q289143
Kanjian Zhang, Kang Cheng, Haikun Wei, Shu-Min Fei
Publication date: 27 May 2016
Published in: Journal of Optimization Theory and Applications
Full work available at URL: https://doi.org/10.1007/s10957-015-0809-6
Keywords: optimal control; Markov decision processes; feedback control; stochastic system; least-squares policy iteration; performance potential; potential estimation algorithm; temporal difference learning method
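The keywords "performance potential" and "policy iteration" refer to the potential-based view of average-reward Markov decision processes, in which the potential vector g solves the Poisson equation g = f − η·1 + P g for transition matrix P, reward vector f, and average reward η. As a minimal illustration (not the paper's own estimation algorithm, which works from sample paths via temporal differences), the following sketch computes the potential of a small, assumed 3-state ergodic chain directly from the model:

```python
import numpy as np

# Hedged sketch: performance potentials of an ergodic Markov chain.
# The potential g solves the Poisson equation
#     g = f - eta*1 + P @ g,   normalized so that pi @ g = 0,
# where P is the transition matrix, f the one-step reward, pi the
# stationary distribution, and eta = pi @ f the average reward.
# P and f below are illustrative values, not from the paper.

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.3, 0.2, 0.5]])
f = np.array([1.0, 2.0, 3.0])

# Stationary distribution: solve pi @ P = pi with sum(pi) = 1.
A = np.vstack([P.T - np.eye(3), np.ones(3)])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

eta = pi @ f  # average reward

# Potential via the fundamental-matrix formula
#     g = (I - P + 1 pi)^{-1} (f - eta*1),
# which yields the solution of the Poisson equation with pi @ g = 0.
g = np.linalg.solve(np.eye(3) - P + np.outer(np.ones(3), pi), f - eta)

# Verify the Poisson equation and the normalization.
assert np.allclose(g, f - eta + P @ g)
assert abs(pi @ g) < 1e-10
```

A sample-path method such as the potential estimation algorithm referenced above would estimate g from observed transitions and rewards instead of inverting this matrix; the closed-form solve here only serves to make the defining equation concrete.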
Cites Work
- Temporal difference-based policy iteration for optimal control of stochastic systems
- Weak convergence theorems for nonexpansive mappings in Banach spaces
- Single sample path-based optimization of Markov chains
- Basic ideas for event-based optimization of Markov systems
- Least squares policy evaluation algorithms with linear function approximation
- Policy iteration based feedback control
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- The policy iteration algorithm for average reward Markov decision processes with general state space
- 10.1162/1532443041827907
- Approximate Dynamic Programming
- A trust region method based on interior point techniques for nonlinear programming