Doubly penalized estimation in additive regression with high-dimensional data (Q2328052)

From MaRDI portal

scientific article

      Statements

      9 October 2019
      The authors consider high-dimensional nonparametric additive regression: given independent observations \((X_1,Y_1),\ldots,(X_n,Y_n)\), where each \(Y_i\in\mathbb{R}\) is a response variable and each \(X_i\in\mathbb{R}^d\) is a vector of covariates, consider the model \(Y_i=g^*(X_i)+\varepsilon_i\), where \[ g^*(x)=\sum_{j=1}^pg_j^*\left(x^{(j)}\right)\,, \] each \(\varepsilon_i\) is a noise term and, for each \(j\), \(x^{(j)}\) is a vector formed from a (small, possibly overlapping) subset of the components of \(x\in\mathbb{R}^d\), with \(p\) possibly larger than \(n\).

The class of estimators of \(g^*\) studied has two penalty components: one using an empirical \(L_2\) norm to induce sparsity of the estimator, and another using functional semi-norms to induce smoothness. The main results of the paper are oracle inequalities for predictive performance in this setting, giving upper bounds on the penalized predictive loss for both fixed and random designs. In the fixed design setting, new observations are drawn with covariates from the sample \((X_1,\ldots,X_n)\), whereas in the random design setting the covariates are drawn from the distributions of \((X_1,\ldots,X_n)\).

These oracle inequalities are established under assumptions of sub-Gaussian tails for the noise, an entropy condition on the relevant functional classes, and an empirical compatibility condition. In the setting of random designs, the sample compatibility condition may be replaced by a population compatibility condition together with a condition ensuring convergence of empirical norms. Compared to existing results in the literature, these conditions are weaker and the resulting inequalities give better rates of convergence. The framework is flexible in that it allows a decoupling of the sparsity and smoothness conditions.
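A doubly penalized estimator of the kind described above can be summarized as follows; the notation here is a sketch, not taken from the paper (the tuning parameters \(\lambda,\mu\), the semi-norms \(\rho_j\), and the exact weighting of the two penalties are assumptions, and the paper's formulation may differ in detail):
\[
\hat g=\sum_{j=1}^p\hat g_j,\qquad
(\hat g_1,\ldots,\hat g_p)\in\mathop{\mathrm{arg\,min}}_{g_1,\ldots,g_p}\;
\frac{1}{n}\sum_{i=1}^n\Bigl(Y_i-\sum_{j=1}^p g_j\bigl(X_i^{(j)}\bigr)\Bigr)^2
+\sum_{j=1}^p\Bigl(\lambda\,\|g_j\|_n+\mu\,\rho_j(g_j)\Bigr),
\]
where \(\|g_j\|_n^2=\frac{1}{n}\sum_{i=1}^n g_j\bigl(X_i^{(j)}\bigr)^2\) is the empirical \(L_2\) norm driving sparsity (a component is dropped when \(\hat g_j\equiv 0\)) and \(\rho_j\) is a functional semi-norm (e.g.\ a Sobolev semi-norm or total variation) driving smoothness.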
The authors consider the special cases of Sobolev and bounded variation spaces (where explicit rates of convergence obtained in the oracle inequalities are shown to match minimax lower bounds), and also give results on convergence of empirical norms that may be of independent interest.
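To make the double penalization concrete, here is a minimal numerical sketch, not the authors' procedure: the cubic polynomial basis, the ridge-type smoothness penalty, the specific tuning values, and the proximal-gradient solver are all simplifying assumptions. Each component is represented in a basis orthonormalized under the empirical inner product, so that the empirical \(L_2\) norm of a fitted component equals the Euclidean norm of its coefficients and the sparsity penalty reduces to group soft-thresholding.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic additive model: only the first two of p covariates are active.
n, p = 200, 10
X = rng.uniform(-1.0, 1.0, size=(n, p))
y = np.sin(np.pi * X[:, 0]) + (X[:, 1] ** 2 - 1.0 / 3.0) + 0.3 * rng.standard_normal(n)

def block(x):
    """Centred cubic polynomial basis, rescaled so (1/n) B'B = I."""
    B = np.column_stack([x, x ** 2, x ** 3])
    B -= B.mean(axis=0)
    Q, _ = np.linalg.qr(B)
    return Q * np.sqrt(n)

blocks = [block(X[:, j]) for j in range(p)]
B = np.hstack(blocks)          # n x (k*p) design matrix
k = 3                          # basis functions per component

# Objective: (1/2n)||y - B theta||^2 + sum_j [lam * ||theta_j|| + (mu/2)||theta_j||^2].
# The group norm ||theta_j|| equals the empirical L2 norm of component j (sparsity);
# the quadratic term is a crude stand-in for a functional smoothness semi-norm.
lam, mu = 0.08, 0.05
L = np.linalg.eigvalsh(B.T @ B / n).max() + mu   # Lipschitz constant of smooth part
theta = np.zeros(B.shape[1])

for _ in range(500):                              # proximal gradient iterations
    grad = -B.T @ (y - B @ theta) / n + mu * theta
    z = theta - grad / L
    for j in range(p):                            # block-wise group soft-thresholding
        zj = z[j * k:(j + 1) * k]
        nrm = np.linalg.norm(zj)
        theta[j * k:(j + 1) * k] = max(0.0, 1.0 - (lam / L) / max(nrm, 1e-12)) * zj

# Empirical norm of each fitted component; inactive components are shrunk to zero.
norms = [np.linalg.norm(theta[j * k:(j + 1) * k]) for j in range(p)]
active = [j for j, v in enumerate(norms) if v > 1e-8]
```

With this choice of \(\lambda\), the group soft-threshold zeroes out the eight pure-noise components while retaining the two truly active ones, illustrating how the empirical-norm penalty performs component selection.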
      additive model
      bounded variation space
      ANOVA model
      high-dimensional data
      metric entropy
      penalized estimation
      reproducing kernel Hilbert space
      Sobolev space
      total variation
      trend filtering
