Asymptotically efficient adaptive allocation rules
From MaRDI portal
Cited in
(only showing first 100 items)
- Robustness of stochastic bandit policies
- Dynamic sampling allocation and design selection
- Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- Primal-dual algorithms for optimization with stochastic dominance
- scientific article; zbMATH DE number 7513920
- Pure exploration in multi-armed bandits problems
- On the convergence rates of expected improvement methods
- Online learning for route planning with on-time arrival reliability
- Sequential design with applications to the trim-loss problem
- On the Prior Sensitivity of Thompson Sampling
- Matrices -- compensating the loss of anschauung
- scientific article; zbMATH DE number 410134
- Stochastic approximation: from statistical origin to big-data, multidisciplinary applications
- Optimal control with learning on the fly: a toy problem
- Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
- scientific article; zbMATH DE number 7370545
- Two-armed bandit problem for parallel data processing systems
- Reinforcement Learning, Bit by Bit
- Learning the distribution with largest mean: two bandit frameworks
- Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
- Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems
- Per-round knapsack-constrained linear submodular bandits
- Thompson sampling for networked control over unknown channels
- Adaptive Algorithm for Multi-Armed Bandit Problem with High-Dimensional Covariates
- Algorithm portfolios for noisy optimization
- On Monte Carlo tree search for weighted vertex coloring
- Branching time active inference: the theory and its generality
- Pure exploration in finitely-armed and continuous-armed bandits
- Online Debiasing for Adaptively Collected High-Dimensional Data With Applications to Time Series Analysis
- Nonstationary bandits with habituation and recovery dynamics
- Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting
- Efficient Sorting in a Dynamic Adverse-Selection Model
- Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
- Batched bandit problems
- Learning in combinatorial optimization: what and how to explore
- Certainty equivalence control with forcing: Revisited
- The multi-armed bandit problem: an efficient nonparametric solution
- The multi-armed bandit problem with covariates
- Functional feature construction for individualized treatment regimes
- Kullback-Leibler upper confidence bounds for optimal sequential allocation
- Response adaptive designs that incorporate switching costs and constraints
- Learning to optimize via information-directed sampling
- Optimal strategies for a class of sequential control problems with precedence relations
- Online collaborative filtering on graphs
- Multinomial Thompson sampling for rating scales and prior considerations for calibrating uncertainty
- Arbitrary side observations in bandit problems
- Close the gaps: a learning-while-doing algorithm for single-product revenue management problems
- Profile-based bandit with unknown profiles
- Reward maximization under uncertainty: leveraging side-observations on networks
- An asymptotically optimal strategy for constrained multi-armed bandit problems
- Asymptotic optimality for decentralised bandits
- Nonparametric bandit methods
- Optimal allocation of simulation experiments in discrete stochastic optimization and approximative algorithms
- Efficient allocations under ambiguity
- Small-Loss Bounds for Online Learning with Partial Information
- Generalized Bandit Problems
- Bandit and covariate processes, with finite or non-denumerable set of arms
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Ballooning multi-armed bandits
- How fragile are information cascades?
- On Bayesian index policies for sequential resource allocation
- Adaptive policies for perimeter surveillance problems
- Boundary crossing probabilities for general exponential families
- An optimal bidimensional multi-armed bandit auction for multi-unit procurement
- Exploration-exploitation policies with almost sure, arbitrarily slow growing asymptotic regret
- Irreversible adaptive allocation rules
- Satisficing in Time-Sensitive Bandit Learning
- A reinforcement learning approach to personalized learning recommendation systems
- Asymptotic efficiency of a sequential allocation rule
- Reading policies for joins: an asymptotic analysis
- Dynamic assortment personalization in high dimensions
- Daisee: Adaptive importance sampling by balancing exploration and exploitation
- Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
- Bandit Change-Point Detection for Real-Time Monitoring High-Dimensional Data Under Sampling Control
- The \(K\)-armed dueling bandits problem
- On incomplete learning and certainty-equivalence control
- Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search
- Infomax strategies for an optimal balance between exploration and exploitation
- Adaptive matching for expert systems with uncertain task types
- Integrated online learning and adaptive control in queueing systems with uncertain payoffs
- Robust control of the multi-armed bandit problem
- Robust sequential design for piecewise-stationary multi-armed bandit problem in the presence of outliers
- Functional Sequential Treatment Allocation
- Improving multi-armed bandit algorithms in online pricing settings
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- Distributed learning in congested environments with partial information
- A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
- scientific article; zbMATH DE number 7370520
- scientific article; zbMATH DE number 7370531
- Adaptive enrichment designs for confirmatory trials
- An online algorithm for the risk-aware restless bandit
- scientific article; zbMATH DE number 7370524
- A conversation with Tze Leung Lai
- Dynamic Inventory Control with Fixed Setup Costs and Unknown Discrete Demand Distribution
- Choosing a good toolkit. II: Bayes-rule based heuristics
- Game of thrones: fully distributed learning for multiplayer bandits
- Reward-modulated Hebbian learning of decision making
- scientific article; zbMATH DE number 7038557
- Clustering in block Markov chains
MaRDI item Q1060517