On the bias, risk, and consistency of sample means in multi-armed bandits
DOI: 10.1137/20M1361249
zbMATH Open: 1476.62032
arXiv: 1902.00746
OpenAlex: W3216378235
MaRDI QID: Q5018902
FDO: Q5018902
Authors: Jaehyeok Shin, Aaditya Ramdas, Alessandro Rinaldo
Publication date: 27 December 2021
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/1902.00746
Recommendations
- Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits
- Active Learning in Multi-armed Bandits
- The sample complexity of exploration in the multi-armed bandit problem
- The multi-armed bandit problem: an efficient nonparametric solution
- Lower bounds on the sample complexity of exploration in the multi-armed bandit problem.
Asymptotic properties of nonparametric inference (62G20); Statistical aspects of big data and data science (62R07); Sampling theory, sample surveys (62D05)
Cites Work
- On the complexity of best-arm identification in multi-armed bandit models
- On confidence sequences
- Time-uniform, nonparametric, nonasymptotic confidence sequences
- Confidence sequences for mean, variance, and median
- Optimum Character of the Sequential Probability Ratio Test
- Introduction to nonparametric estimation
- Bandit algorithms
- Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
- Some aspects of the sequential design of experiments
- Stopped Random Walks
- Information-theoretic determination of minimax rates of convergence
- Estimation of densities and applications
- Self-normalized processes: exponential inequalities, moment bounds and iterated logarithm laws.
- Relative loss bounds for on-line density estimation with the exponential family of distributions
- On the Asymptotic Efficiency of a Sequential Procedure for Estimating the Mean
- Further Remarks on Sequential Estimation: The Exponential Case
- Probability
- \(L_p\)-version of the Dubins-Savage inequality and some exponential inequalities
- Estimation Following Sequential Tests
- Some further remarks on inequalities for sample sums
- Time-uniform Chernoff bounds via nonnegative supermartingales
Cited In (1)