The multi-armed bandit, with constraints
Publication:378726
DOI: 10.1007/s10479-012-1250-y
zbMath: 1274.90470
arXiv: 1203.4640
OpenAlex: W1965106111
MaRDI QID: Q378726
Uriel G. Rothblum, Eugene A. Feinberg, Eric V. Denardo
Publication date: 12 November 2013
Published in: Annals of Operations Research
Full work available at URL: https://arxiv.org/abs/1203.4640
Related Items
- Four proofs of Gittins' multiarmed bandit theorem
- Index policy for multiarmed bandit problem with dynamic risk measures
- MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
- Coordinating Pricing and Inventory Replenishment with Nonparametric Demand Learning
- An asymptotically optimal strategy for constrained multi-armed bandit problems
- On the reduction of total‐cost and average‐cost MDPs to discounted MDPs
- Robust control of the multi-armed bandit problem
Cites Work
- A generalized Gittins index for a Markov chain and its recursive calculation
- On the Gittins index for multiarmed bandits
- Multi-armed bandits in discrete and continuous time
- A short proof of the Gittins index theorem
- Dynamic allocation problems in continuous time
- Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits
- Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes
- A $(2/3)n^3$ Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain
- Multi‐Armed Bandit Allocation Indices
- Branching Bandit Processes
- A Turnpike Theorem For A Risk-Sensitive Markov Decision Process with Stopping
- Extensions of the multiarmed bandit problem: The discounted case
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Conservation Laws, Extended Polymatroids and Multiarmed Bandit Problems; A Polyhedral Approach to Indexable Systems
- Risk-Sensitive and Risk-Neutral Multiarmed Bandits
- Contraction Mappings in the Theory Underlying Dynamic Programming
- Discrete Dynamic Programming with Sensitive Discount Optimality Criteria