Aggregation for Gaussian regression
Abstract: This paper studies statistical aggregation procedures in the regression setting. A motivating factor is the existence of many different methods of estimation, leading to possibly competing estimators. We consider here three different types of aggregation: model selection (MS) aggregation, convex (C) aggregation and linear (L) aggregation. The objective of (MS) is to select the optimal single estimator from the list; that of (C) is to select the optimal convex combination of the given estimators; and that of (L) is to select the optimal linear combination of the given estimators. We are interested in evaluating the rates of convergence of the excess risks of the estimators obtained by these procedures. Our approach is motivated by recently published minimax results [Nemirovski, A. (2000). Topics in non-parametric statistics. Lectures on Probability Theory and Statistics (Saint-Flour, 1998). Lecture Notes in Math. 1738 85--277. Springer, Berlin; Tsybakov, A. B. (2003). Optimal rates of aggregation. Learning Theory and Kernel Machines. Lecture Notes in Artificial Intelligence 2777 303--313. Springer, Heidelberg]. There exist competing aggregation procedures achieving optimal convergence rates for each of the (MS), (C) and (L) cases separately. Since these procedures are not directly comparable with each other, we suggest an alternative solution. We prove that all three optimal rates, as well as those for the newly introduced (S) aggregation (subset selection), are nearly achieved via a single "universal" aggregation procedure. The procedure consists of mixing the initial estimators with weights obtained by penalized least squares. Two different penalties are considered: one of them is of the BIC type, the second one is a data-dependent \(\ell_1\)-type penalty.
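The abstract states the three aggregation targets only in words. As a concrete illustration, the following minimal Python sketch computes, on simulated toy data with hypothetical candidate estimators, the empirical (MS), (C) and (L) aggregates by (penalized) least squares over the corresponding weight sets, together with an \(\ell_1\)-penalized mixing step in the spirit of the data-dependent penalty mentioned above. The toy data, the candidate estimators and the penalty level `lam` are illustrative assumptions; this is a sketch of the general idea, not the paper's universal aggregation procedure.

```python
# Minimal illustrative sketch (not the paper's procedure): given M candidate
# estimators evaluated on a sample, compute the (MS), (C) and (L) aggregates
# and an l1-penalized mixing of the candidates by penalized least squares.
import numpy as np

rng = np.random.default_rng(0)

# Toy data and hypothetical candidates: polynomial fits of degrees 0..M-1.
n, M = 200, 5
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(3.0 * x) + 0.3 * rng.standard_normal(n)
F = np.column_stack([np.polyval(np.polyfit(x, y, deg=j), x) for j in range(M)])

def empirical_risk(pred):
    """Empirical squared risk (1/n) * sum_i (y_i - pred_i)^2."""
    return np.mean((y - pred) ** 2)

# (MS) model-selection aggregation: the best single estimator from the list.
j_ms = int(np.argmin([empirical_risk(F[:, j]) for j in range(M)]))

# (L) linear aggregation: unconstrained least-squares weights.
w_l, *_ = np.linalg.lstsq(F, y, rcond=None)

# (C) convex aggregation: projected gradient descent on the probability simplex.
def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

step = 1.0 / np.linalg.norm(F, 2) ** 2   # 1 / Lipschitz constant of the gradient
w_c = np.full(M, 1.0 / M)
for _ in range(5000):
    w_c = project_simplex(w_c - step * (F.T @ (F @ w_c - y)))

# l1-penalized mixing via ISTA (soft thresholding); the penalty level `lam`
# is an arbitrary illustrative choice, not the data-dependent penalty of the paper.
lam = 1.0
w_s = np.zeros(M)
for _ in range(5000):
    z = w_s - step * (F.T @ (F @ w_s - y))
    w_s = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)

print("(MS) risk:", empirical_risk(F[:, j_ms]))
print("(L)  risk:", empirical_risk(F @ w_l))
print("(C)  risk:", empirical_risk(F @ w_c))
print("l1   risk:", empirical_risk(F @ w_s))
```

Since the single estimators are vertices of the simplex and the simplex is contained in the full linear span, the empirical risks printed above can only decrease from (MS) to (C) to (L); the paper's results concern the corresponding excess risks under the true regression function, which need not follow this ordering.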
Cites work
- scientific article; zbMATH DE number 1522808
- scientific article; zbMATH DE number 845714
- scientific article; zbMATH DE number 893887
- A distribution-free theory of nonparametric regression
- A new look at the statistical model identification
- Adaptive Regression by Mixing
- Adaptive estimation with soft thresholding penalties
- Adaptive model selection using empirical complexities
- Aggregated estimators and empirical complexity for least square regression
- Aggregating regression procedures to improve performance
- Atomic decomposition by basis pursuit
- Combining different procedures for adaptive regression
- Consistent covariate selection and post model selection inference in semiparametric regression.
- Estimating the dimension of a model
- Functional aggregation for nonparametric regression.
- Gaussian model selection
- Information Theory and Mixing Least-Squares Regressions
- Introduction to nonparametric estimation
- Learning Theory and Kernel Machines
- Learning by mirror averaging
- Least angle regression. (With discussion)
- Local Rademacher complexities and oracle inequalities in risk minimization. (2004 IMS Medallion Lecture). (With discussions and rejoinder)
- Model selection for regression on a fixed design
- Model selection for regression on a random design
- Model selection in nonparametric regression
- Model selection via testing: an alternative to (penalized) maximum likelihood estimators.
- Oracle inequalities for inverse problems
- Ordered linear smoothers
- Recursive aggregation of estimators by the mirror descent algorithm with averaging
- Regularization of Wavelet Approximations
- Risk bounds for model selection via penalization
- Sequential Procedures for Aggregating Arbitrary Estimators of a Conditional Mean
- Some Comments on \(C_p\)
- Stable recovery of sparse overcomplete representations in the presence of noise
- Statistical learning theory and stochastic optimization. Ecole d'Eté de Probabilités de Saint-Flour XXXI -- 2001.
- The boosting approach to machine learning: an overview
- The risk inflation criterion for multiple regression
- Universal approximation bounds for superpositions of a sigmoidal function
- Wavelets, approximation, and statistical applications
Cited in
- Aggregated wavelet estimation and its application to ultra-fast fMRI
- A nonlinear aggregation type classifier
- Combining a relaxed EM algorithm with Occam's razor for Bayesian variable selection in high-dimensional regression
- Least squares after model selection in high-dimensional sparse models
- Sign-constrained least squares estimation for high-dimensional regression
- Exponential screening and optimal rates of sparse estimation
- A unified framework for high-dimensional analysis of \(M\)-estimators with decomposable regularizers
- Pivotal estimation via square-root lasso in nonparametric regression
- Aggregation by exponential weighting, sharp PAC-Bayesian bounds and sparsity
- Model averaging by jackknife criterion in models with dependent data
- Honest variable selection in linear and logistic regression models via \(\ell _{1}\) and \(\ell _{1}+\ell _{2}\) penalization
- On the asymptotic properties of the group lasso estimator for linear models
- PAC-Bayesian bounds for sparse regression estimation with exponential weights
- Functional aggregation for nonparametric regression.
- \(\ell_1\)-penalized quantile regression in high-dimensional sparse models
- Mirror averaging with sparsity priors
- Learning by mirror averaging
- SLOPE is adaptive to unknown sparsity and asymptotically minimax
- Generalization of constraints for high dimensional regression problems
- Estimator selection with respect to Hellinger-type risks
- Near-ideal model selection by \(\ell _{1}\) minimization
- Hyper-sparse optimal aggregation
- Sparse regression learning by aggregation and Langevin Monte-Carlo
- Autoregressive process modeling via the Lasso procedure
- Performance of empirical risk minimization in linear aggregation
- On the optimality of the aggregate with exponential weights for low temperatures
- Adaptive Dantzig density estimation
- Oracle inequalities and optimal inference under group sparsity
- Classification of longitudinal data through a semiparametric mixed-effects model based on Lasso-type estimators
- Deviation optimal learning using greedy \(Q\)-aggregation
- Estimation of high-dimensional low-rank matrices
- Aggregation via empirical risk minimization
- Simultaneous analysis of Lasso and Dantzig selector
- Non-asymptotic oracle inequalities for the Lasso and group Lasso in high dimensional logistic model
- Some sharp performance bounds for least squares regression with \(L_1\) regularization
- Some theoretical results on the grouped variables Lasso
- Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators
- Best subset selection via a modern optimization lens
- Lasso-type recovery of sparse representations for high-dimensional data
- Prediction of time series by statistical learning: general losses and fast rates
- High-dimensional regression with unknown variance
- Anisotropic de-noising in functional deconvolution model with dimension-free convergence rates
- Optimal equivariant prediction for high-dimensional linear models with arbitrary predictor covariance
- Lasso and probabilistic inequalities for multivariate point processes
- Microlocal analysis of the geometric separation problem
- Estimator selection in the Gaussian setting
- Generalized mirror averaging and \(D\)-convex aggregation
- Sharp oracle inequalities for aggregation of affine estimators
- Empirical risk minimization is optimal for the convex aggregation problem
- From local kernel to nonlocal multiple-model image denoising
- A new approach to estimator selection
- Quasi-likelihood and/or robust estimation in high dimensions
- Sparse recovery under matrix uncertainty
- Adaptive estimation of the baseline hazard function in the Cox model by model selection, with high-dimensional covariates
- Kullback-Leibler aggregation and misspecified generalized linear models
- A universal procedure for aggregating estimators
- Mixing least-squares estimators when the variance is unknown
- Structured estimation for the nonparametric Cox model
- On the conditions used to prove oracle results for the Lasso
- Simultaneous adaptation to the margin and to complexity in classification
- Model selection in regression under structural constraints
- Sparse estimation by exponential weighting
- Deconvolution model with fractional Gaussian noise: a minimax study
- AIC for the Lasso in generalized linear models
- Laplace deconvolution with noisy observations
- Multichannel deconvolution with long-range dependence: a minimax study
- Sparse high-dimensional varying coefficient model: nonasymptotic minimax study
- Sparsity in penalized empirical risk minimization
- Estimation of matrices with row sparsity
- General oracle inequalities for model selection
- Transductive versions of the Lasso and the Dantzig selector
- SPADES and mixture models
- Optimal learning with \textit{Q}-aggregation
- Greedy algorithms for prediction
- Linear and convex aggregation of density estimators
- Optimal rates of aggregation in classification under low noise assumption
- Aggregating regression procedures to improve performance
- The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso)
- High-dimensional additive hazards models and the lasso
- Aggregating estimates by convex optimization
- Aggregated estimators and empirical complexity for least square regression
- Nonparametric sequential prediction of time series
- Estimation in nonparametric regression model with additive and multiplicative noise via Laguerre series
- Model aggregation for doubly divided data with large size and large dimension
- MAP model selection in Gaussian regression
- On the optimality of the empirical risk minimization procedure for the convex aggregation problem
- Adaptive estimation over anisotropic functional classes via oracle approach
- Lasso-type estimators for semiparametric nonlinear mixed-effects models estimation
- Optimal model selection in heteroscedastic regression using piecewise polynomial functions
- Non-parametric Poisson regression from independent and weakly dependent observations by model selection
- Sharp connections between Berry-Esseen characteristics and Edgeworth expansions for stationary processes
- Anisotropic functional deconvolution for the irregular design: A minimax study
- Network Estimation by Mixing: Adaptivity and More
- Robust forecast combinations
- Sparsity considerations for dependent variables
- The smooth-Lasso and other \(\ell _{1}+\ell _{2}\)-penalized methods
- Minimax adaptive wavelet estimator for the anisotropic functional deconvolution model with unknown kernel
- Blind deconvolution model in periodic setting with fractional Gaussian noise
- Sparse linear regression models of high dimensional covariates with non-Gaussian outliers and Berkson error-in-variable under heteroscedasticity
- Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices