Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
From MaRDI portal
Publication:2373576
Mathematics Subject Classification:
- Nonparametric regression and quantile regression (62G08)
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Learning and adaptive systems in artificial intelligence (68T05)
- Pattern recognition, speech recognition (68T10)
- Computational learning theory (68Q32)
- Probability theory on algebraic and topological structures (60B99)
Abstract: Let \(\mathcal{F}\) be a class of measurable functions defined on a probability space \((S,\mathcal{A},P)\). Given a sample \((X_1,\dots,X_n)\) of i.i.d. random variables taking values in \(S\) with common distribution \(P\), let \(P_n\) denote the empirical measure based on \((X_1,\dots,X_n)\). We study the empirical risk minimization problem \(P_n f \to \min\), \(f\in\mathcal{F}\). Given a solution \(\hat{f}_n\) of this problem, the goal is to obtain very general upper bounds on its excess risk \[\mathcal{E}_P(\hat{f}_n) := P\hat{f}_n - \inf_{f\in\mathcal{F}} Pf,\] expressed in terms of relevant geometric parameters of the class \(\mathcal{F}\). Using concentration inequalities and other empirical process tools, we obtain both distribution-dependent and data-dependent upper bounds on the excess risk that are of asymptotically correct order in many examples. The bounds involve localized sup-norms of empirical and Rademacher processes indexed by functions from the class. We use these bounds to develop model selection techniques in abstract risk minimization problems that can be applied to more specialized frameworks of regression and classification.
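To make the abstract's objects concrete, the following sketch simulates empirical risk minimization over a small finite class and estimates the resulting excess risk \(\mathcal{E}_P(\hat f_n) = P\hat f_n - \inf_{f\in\mathcal{F}} Pf\) by Monte Carlo. The toy class of threshold classifiers, the sample sizes, and the label-noise level are illustrative assumptions, not constructions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite class F: 0-1 losses of threshold classifiers
# x -> sign(x - t). This is an illustrative stand-in for the abstract
# function class in the paper, not the paper's own construction.
thresholds = np.linspace(-2.0, 2.0, 41)

# Sample X_1,...,X_n i.i.d. from P: x ~ N(0,1), label y = sign(x)
# flipped with probability 0.2.
n = 500
x = rng.normal(size=n)
y = np.where(rng.random(n) < 0.8, np.sign(x), -np.sign(x))

def empirical_risk(t):
    """P_n f_t: average 0-1 loss of the classifier sign(x - t)."""
    return np.mean(np.sign(x - t) != y)

# Empirical risk minimization: hat{f}_n = argmin_{f in F} P_n f.
risks = np.array([empirical_risk(t) for t in thresholds])
t_hat = thresholds[np.argmin(risks)]

# Excess risk E_P(hat{f}_n) = P hat{f}_n - inf_{f in F} P f,
# with P f approximated on a large fresh sample from P.
m = 200_000
x_test = rng.normal(size=m)
y_test = np.where(rng.random(m) < 0.8, np.sign(x_test), -np.sign(x_test))

def true_risk(t):
    return np.mean(np.sign(x_test - t) != y_test)

true_risks = np.array([true_risk(t) for t in thresholds])
excess = true_risk(t_hat) - true_risks.min()
print(f"t_hat = {t_hat:.2f}, estimated excess risk = {excess:.4f}")
```

The paper's bounds control how fast this excess risk shrinks with \(n\), in terms of localized Rademacher complexities of the class rather than Monte Carlo evaluation.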
Cites Work
- scientific article; zbMATH DE number 2089352
- scientific article; zbMATH DE number 2089354
- scientific article; zbMATH DE number 5654889
- scientific article; zbMATH DE number 49190
- scientific article; zbMATH DE number 1332320
- scientific article; zbMATH DE number 1064642
- scientific article; zbMATH DE number 2034518
- scientific article; zbMATH DE number 1552503
- scientific article; zbMATH DE number 3446442
- scientific article; zbMATH DE number 893887
- 10.1162/1532443041424319
- A Bennett concentration inequality and its application to suprema of empirical processes
- A distribution-free theory of nonparametric regression
- A new look at independence
- A sharp concentration inequality with applications
- An empirical process approach to the uniform consistency of kernel-type function estimators
- Bounding the generalization error of convex combinations of classifiers: Balancing the dimensionality and the margins.
- Complexities of convex combinations and bounding the generalization error in classification
- Complexity regularization via localized random penalties
- Concentration inequalities and asymptotic results for ratio type empirical processes
- Consistency of Support Vector Machines and Other Regularized Kernel Classifiers
- Convergence rate of sieve estimates
- Convexity, Classification, and Risk Bounds
- Efficient agnostic learning of neural networks with bounded fan-in
- Empirical margin distributions and bounding the generalization error of combined classifiers
- Empirical minimization
- Improving the sample complexity using global data
- Inequalities for uniform deviations of averages from expectations with applications to nonparametric regression
- Left concentration inequalities for empirical processes
- Local Rademacher complexities
- Model selection and error estimation
- Model selection for regression on a random design
- Moment inequalities for functions of independent random variables
- Neural Network Learning
- New concentration inequalities in product spaces
- On consistency of kernel density estimators for randomly censored data: Rates holding uniformly over adaptive intervals
- On the Bayes-risk consistency of regularized boosting methods.
- Optimal aggregation of classifiers in statistical learning.
- Oracle inequalities and nonparametric function estimation
- Rademacher penalties and structural risk minimization
- Risk bounds for model selection via penalization
- Sharper bounds for Gaussian and empirical processes
- Smooth discrimination analysis
- Some applications of concentration inequalities to statistics
- Some limit theorems for empirical processes (with discussion)
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Statistical behavior and consistency of classification methods based on convex risk minimization.
- Statistical performance of support vector machines
- Uniform Central Limit Theorems
- Weak convergence and empirical processes. With applications to statistics
Cited In (first 100 items shown)
- Convergence rates for shallow neural networks learned by gradient descent
- Nonparametric regression using deep neural networks with ReLU activation function
- Aggregation of estimators and stochastic optimization
- Localized Gaussian width of \(M\)-convex hulls with applications to Lasso and convex aggregation
- Suboptimality of constrained least squares and improvements via non-linear predictors
- Relative deviation learning bounds and generalization with unbounded loss functions
- On least squares estimation under heteroscedastic and heavy-tailed errors
- Sample average approximation with heavier tails. I: Non-asymptotic bounds with weak assumptions and stochastic constraints
- Performance guarantees for policy learning
- Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics
- Optimal linear discriminators for the discrete choice model in growing dimensions
- Sample average approximation with heavier tails II: localization in stochastic convex optimization and persistence results for the Lasso
- Local Rademacher complexity-based learning guarantees for multi-task learning
- Concentration inequalities for two-sample rank processes with application to bipartite ranking
- Measuring the capacity of sets of functions in the analysis of ERM
- Robust multicategory support vector machines using difference convex algorithm
- Wild bootstrap inference for penalized quantile regression for longitudinal data
- ERM and RERM are optimal estimators for regression problems when malicious outliers corrupt the labels
- Estimation bounds and sharp oracle inequalities of regularized procedures with Lipschitz loss functions
- Empirical variance minimization with applications in variance reduction and optimal control
- Locally simultaneous inference
- Statistical inference using regularized M-estimation in the reproducing kernel Hilbert space for handling missing data
- Minimax adaptive dimension reduction for regression
- Noisy discriminant analysis with boundary assumptions
- Sample average approximation for stochastic programming with equality constraints
- Measuring distributional asymmetry with Wasserstein distance and Rademacher symmetrization
- Joint regression analysis of mixed-type outcome data via efficient scores
- Surrogate losses in passive and active learning
- Concentration bounds for the empirical angular measure with statistical learning applications
- Optimal robust mean and location estimation via convex programs with respect to any pseudo-norms
- Robust statistical learning with Lipschitz and convex loss functions
- On the minimax optimality and superiority of deep neural network learning over sparse parameter spaces
- Mass volume curves and anomaly ranking
- Fast generalization error bound of deep learning without scale invariance of activation functions
- Set structured global empirical risk minimizers are rate optimal in general dimensions
- Complexity versus agreement for many views. Co-regularization for multi-view semi-supervised learning
- Fast rates for general unbounded loss functions: from ERM to generalized Bayes
- Complex sampling designs: uniform limit theorems and applications
- Inference on covariance operators via concentration inequalities: \(k\)-sample tests, classification, and clustering via Rademacher complexities
- Solving PDEs on spheres with physics-informed convolutional neural networks
- Bandwidth selection in kernel empirical risk minimization via the gradient
- A moment-matching approach to testable learning and a new characterization of Rademacher complexity
- Discussion of "On concentration for (regularized) empirical risk minimization" by Sara van de Geer and Martin Wainwright
- Deep learning: a statistical viewpoint
- Multiplier \(U\)-processes: sharp bounds and applications
- On the optimality of the empirical risk minimization procedure for the convex aggregation problem
- Optimal upper and lower bounds for the true and empirical excess risks in heteroscedastic least-squares regression
- Optimal model selection in heteroscedastic regression using piecewise polynomial functions
- Gibbs posterior concentration rates under sub-exponential type losses
- Nonasymptotic analysis of robust regression with modified Huber's loss
- From Gauss to Kolmogorov: localized measures of complexity for ellipses
- Robust supervised learning with coordinate gradient descent
- On the optimality of sample-based estimates of the expectation of the empirical minimizer
- Concentration inequalities for samples without replacement
- A universal procedure for aggregating estimators
- Bayesian fractional posteriors
- Learning Theory
- Rademacher penalties and structural risk minimization
- Parametric or nonparametric? A parametricness index for model selection
- Oracle inequalities in empirical risk minimization and sparse recovery problems. École d'Été de Probabilités de Saint-Flour XXXVIII-2008.
- Fast learning rate of non-sparse multiple kernel learning and optimal regularization strategies
- Local learning estimates by integral operators
- Fast learning rates in statistical inference through aggregation
- Sampling and empirical risk minimization
- Rho-estimators revisited: general theory and applications
- Tests and estimation strategies associated to some loss functions
- Tikhonov, Ivanov and Morozov regularization for support vector machine learning
- A no-free-lunch theorem for multitask learning
- Model selection by resampling penalization
- A high-dimensional Wilks phenomenon
- Empirical minimization
- Oracle inequalities for cross-validation type procedures
- A new method for estimation and model selection: \(\rho\)-estimation
- Singularity, misspecification and the convergence rate of EM
- Convergence rates of least squares regression estimators with heavy-tailed errors
- Square root penalty: Adaption to the margin in classification and in edge estimation
- Adaptive estimation of a distribution function and its density in sup-norm loss by wavelet and spline projections
- Sharper lower bounds on the performance of the empirical risk minimization algorithm
- Global uniform risk bounds for wavelet deconvolution estimators
- Risk bounds for CART classifiers under a margin condition
- Margin-adaptive model selection in statistical learning
- Honest confidence sets in nonparametric IV regression and other ill-posed models
- A local Vapnik-Chervonenkis complexity
- Nonasymptotic bounds for vector quantization in Hilbert spaces
- Aggregation for Gaussian regression
- Fast learning rates for plug-in classifiers
- Local Rademacher complexities
- Random design analysis of ridge regression
- Classifiers of support vector machine type with \(\ell_1\) complexity regularization
- Rademacher complexity for Markov chains: applications to kernel smoothing and Metropolis-Hastings
- An elementary analysis of ridge regression with random design
- Compressive statistical learning with random feature moments
- Concentration inequalities and confidence bands for needlet density estimators on compact homogeneous manifolds