INDEXABILITY OF BANDIT PROBLEMS WITH RESPONSE DELAYS
Publication:3585147
DOI: 10.1017/S0269964810000021
zbMath: 1200.90066
MaRDI QID: Q3585147
Publication date: 19 August 2010
Published in: Probability in the Engineering and Informational Sciences
Related Items
- MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
- A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits
- Robust control of the multi-armed bandit problem
- Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
Cites Work
- Dynamic priority allocation via restless bandit marginal productivity indices
- One-armed bandit models with continuous and delayed responses
- Optimal learning and experimentation in bandit problems
- Optimality of monotonic policies for two-action Markovian decision processes, with applications to control of queues with delayed information
- New adaptive designs for delayed response models
- Dynamic Assortment with Demand Learning for Seasonal Consumer Goods
- A Learning Approach for Interactive Marketing to a Customer Segment
- On an index policy for restless bandits
- Turnpike Optimality of Smith's Rule in Parallel Machines Stochastic Scheduling
- A Dynamic Inventory Model with Stochastic Lead Times