A reinforcement learning algorithm based on policy iteration for average reward: Empirical results with yield management and convergence analysis

Related Items (10)

- Least squares approximate policy iteration for learning bid prices in choice-based revenue management
- Simulation optimization for revenue management of airlines with cancellations and overbooking
- Convergence of deep fictitious play for stochastic differential games
- Reinforcement learning for joint pricing, lead-time and scheduling decisions in make-to-order systems
- A reinforcement learning approach to distribution-free capacity allocation for sea cargo revenue management
- Integrated revenue management approaches for capacity control with planned upgrades
- Minimizing mean weighted tardiness in unrelated parallel machine scheduling with reinforcement learning
- Dynamic cruise ship revenue management
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- Look-ahead control of conveyor-serviced production station by using potential-based online policy iteration
