Empirical entropy, minimax regret and minimax risk
Publication:520672
DOI: 10.3150/14-BEJ679 · zbMATH Open: 1380.62176 · arXiv: 1308.1147 · OpenAlex: W2111396966 · MaRDI QID: Q520672 · FDO: Q520672
Authors: Alexander Rakhlin, Karthik Sridharan, Alexandre B. Tsybakov
Publication date: 5 April 2017
Published in: Bernoulli
Abstract: We consider the random design regression model with square loss. We propose a method that aggregates empirical minimizers (ERM) over appropriately chosen random subsets and reduces to ERM in the extreme case, and we establish sharp oracle inequalities for its risk. We show that, under the \(\varepsilon^{-p}\) growth of the empirical \(\varepsilon\)-entropy, the excess risk of the proposed method attains the rate \(n^{-2/(2+p)}\) for \(p\in(0,2)\) and \(n^{-1/p}\) for \(p>2\), where \(n\) is the sample size. Furthermore, for \(p\in(0,2)\), the excess risk rate matches the behavior of the minimax risk of function estimation in regression problems under the well-specified model. This yields the conclusion that the rates of statistical estimation in well-specified models (minimax risk) and in misspecified models (minimax regret) are equivalent in the regime \(p\in(0,2)\). In other words, for \(p\in(0,2)\) the problem of statistical learning enjoys the same minimax rate as the problem of statistical estimation. On the contrary, for \(p>2\) we show that the rates of the minimax regret are, in general, slower than those of the minimax risk. Our oracle inequalities also imply the \(\frac{v}{n}\log\frac{n}{v}\) rates for Vapnik-Chervonenkis type classes of dimension \(v\) without the usual convexity assumption on the class; we show that these rates are optimal. Finally, for a slightly modified method, we derive a bound on the excess risk of \(s\)-sparse convex aggregation improving that of Lounici [Math. Methods Statist. 16 (2007) 246-259] and providing the optimal rate.
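For orientation, the rate regimes described in the abstract can be summarized in a single display. This is only a restatement of the abstract's claims, not an additional result; here \(p\) is the entropy growth exponent, \(n\) the sample size, and \(v\) the dimension of the Vapnik-Chervonenkis type class:
\[
\text{excess risk} \;\lesssim\;
\begin{cases}
n^{-2/(2+p)}, & 0 < p < 2,\\[2pt]
n^{-1/p}, & p > 2,
\end{cases}
\qquad
\text{VC-type classes: } \frac{v}{n}\log\frac{n}{v}.
\]
In the first regime the rate coincides with the minimax risk of well-specified regression; in the second the minimax regret is, in general, slower than the minimax risk; the VC-type rate is shown to be optimal.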
Full work available at URL: https://arxiv.org/abs/1308.1147
Recommendations
- Empirical risk minimization is optimal for the convex aggregation problem
- Fast learning rates in statistical inference through aggregation
- Sparsity in penalized empirical risk minimization
- Learning Theory and Kernel Machines
- On the optimality of the empirical risk minimization procedure for the convex aggregation problem
Mathematics Subject Classification:
- Nonparametric regression and quantile regression (62G08)
- Minimax procedures in statistical decision theory (62C20)
Cited In (20)
- Bayesian fractional posteriors
- Bounding the expectation of the supremum of empirical processes indexed by Hölder classes
- Orthogonal statistical learning
- Suboptimality of constrained least squares and improvements via non-linear predictors
- Title not available
- Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability
- On least squares estimation under heteroscedastic and heavy-tailed errors
- Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics
- Isotonic regression with unknown permutations: statistics, computation and adaptation
- Distribution-free robust linear regression
- Empirical variance minimization with applications in variance reduction and optimal control
- Isotonic regression in general dimensions
- Title not available
- Minimax rates for conditional density estimation via empirical entropy
- Relaxing the i.i.d. assumption: adaptively minimax optimal regret via root-entropic regularization
- Localization of VC classes: beyond local Rademacher complexities
- Optimal functional supervised classification with separation condition
- Title not available
- Deep learning: a statistical viewpoint
- Entropy Maximization for Partially Observable Markov Decision Processes