A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis
Publication:1771225
DOI: 10.1023/B:MACH.0000019802.64038.6c
zbMath: 1067.68127
Wikidata: Q115010233
Scholia: Q115010233
MaRDI QID: Q1771225
Publication date: 7 April 2005
Published in: Machine Learning
Related Items (10)
Least squares approximate policy iteration for learning bid prices in choice-based revenue management
Simulation optimization for revenue management of airlines with cancellations and overbooking
Convergence of deep fictitious play for stochastic differential games
Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems
A reinforcement learning approach to distribution-free capacity allocation for sea cargo revenue management
Integrated revenue management approaches for capacity control with planned upgrades
Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning
Dynamic cruise ship revenue management
A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration