scientific article

Publication date: 1980

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

infinite time horizon ⋮ Gittins index ⋮ optimal resource allocation ⋮ dynamic allocation indices ⋮ proof simplification ⋮ index rule ⋮ multi-armed bandit process ⋮ optimal expected total reward

Mathematics Subject Classification ID

Sequential statistical methods (62L99) ⋮ Dynamic programming (90C39) ⋮ Markov and semi-Markov decision processes (90C40)

Related Items (67)

Optimal learning and experimentation in bandit problems. ⋮ Applicable stochastic control: From theory to practice ⋮ A bisection/successive approximation method for computing Gittins indices ⋮ Optimal control of single-server queueing networks ⋮ Multi-armed bandit problem revisited ⋮ The performance of forwards induction policies ⋮ Parallel search for the best alternative ⋮ Strategic conversations under imperfect information: epistemic message exchange games ⋮ Computational aspects in applied stochastic control ⋮ Stochastic scheduling and forwards induction ⋮ Multi-armed bandits with simple arms ⋮ Bandit and covariate processes, with finite or non-denumerable set of arms ⋮ Optimistic Gittins Indices ⋮ Competing Markov decision processes ⋮ Incentivizing Exploration with Heterogeneous Value of Money ⋮ Multi-armed bandit processes with optimal selection of the operating times ⋮ On Gittins' index theorem in continuous time ⋮ Four proofs of Gittins' multiarmed bandit theorem ⋮ Continue, quit, restart probability model ⋮ MULTI-ARMED BANDITS WITH COVARIATES: THEORY AND APPLICATIONS ⋮ Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials ⋮ Kullback-Leibler upper confidence bounds for optimal sequential allocation ⋮ Stochastic scheduling on a single machine subject to multiple breakdowns according to different probabilities ⋮ The multi-armed bandit, with constraints ⋮ Multi-machine preventive maintenance scheduling with imperfect interventions: a restless bandit approach ⋮ The achievable region method in the optimal control of queueing systems; formulations, bounds and policies ⋮ Optimal activation of halting multi-armed bandit models ⋮ Common value experimentation ⋮ Index policy for multiarmed bandit problem with dynamic risk measures ⋮ Encounters with Martingales in Statistics and Stochastic Optimization ⋮ An application of Edgeworth expansion in Bayesian inferences: Optimal sample sizes in clinical trials ⋮ A perpetual search for talents across overlapping generations: a learning process ⋮ ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS ⋮ Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems ⋮ Index policies for discounted bandit problems with availability constraints ⋮ A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits ⋮ Decomposing risk in an exploitation-exploration problem with endogenous termination time ⋮ Branching bandits: A sequential search process with correlated pay-offs. ⋮ Sequential process control under capacity constraints. ⋮ Dynamic priority allocation via restless bandit marginal productivity indices ⋮ A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model ⋮ Optimal learning before choice ⋮ Some indexable families of restless bandit problems ⋮ Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation ⋮ A generalized Gittins index for a Markov chain and its recursive calculation ⋮ Max-plus decomposition of supermartingales and convex order. Application to American options and portfolio insurance ⋮ Optimal stopping for Brownian motion with applications to sequential analysis and option pricing ⋮ Explicit Gittins Indices for a Class of Superdiffusive Processes ⋮ An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits ⋮ Tax problems in the undiscounted case ⋮ Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications ⋮ A survey of Markov decision models for control of networks of queues ⋮ Efficiency in lung transplant allocation strategies ⋮ Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures ⋮ Un ordonnancement dynamique de tâches stochastiques sur un seul processeur ⋮ Unnamed Item ⋮ Gittins' theorem under uncertainty ⋮ Finite state multi-armed bandit problems: Sensitive-discount, average-reward and average-overtaking optimality ⋮ Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit ⋮ Multi-armed bandits in discrete and continuous time ⋮ Uncertainty in learning, choice, and visual fixation ⋮ K competing queues with geometric service requirements and linear costs: The \(\mu\) c-rule is always optimal ⋮ On scheduling influential stochastic tasks on a single machine ⋮ Matrices -- compensating the loss of anschauung ⋮ A General Theory of MultiArmed Bandit Processes with Constrained Arm Switches ⋮ Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges ⋮ Optimal stopping problems for multiarmed bandit processes with arms' independence
