A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis

From MaRDI portal
Publication:1771225

DOI: 10.1023/B:MACH.0000019802.64038.6c
zbMath: 1067.68127
Wikidata: Q115010233
Scholia: Q115010233
MaRDI QID: Q1771225

Abhijit Gosavi

Publication date: 7 April 2005

Published in: Machine Learning







