Bayesian linear regression for multivariate responses under group sparsity (Q2175004)

From MaRDI portal
scientific article

    27 April 2020
A Bayesian multivariate high-dimensional linear regression model is considered, \[ Y_i=\sum_{j=1}^G X_{ij}\beta_j + \epsilon_i, \quad i=1,\dots,n, \quad G>1, \] where \(Y_i\) is a \(1\times d\) response variable, \(X_{ij}\) is a \(1\times p_j\) predictor variable, \(\beta_j\) is a \(p_j\times d\) matrix of regression coefficients, and \(\epsilon_1,\ldots,\epsilon_n\) are i.i.d. \(N(0,\Sigma)\) with an unknown covariance matrix \(\Sigma\). The group structure of the predictors is predetermined, and group sparsity is imposed: many groups contain only zero coefficients, while the remaining (non-zero) groups may still contain some zero coordinates.
A product of independent spike-and-slab priors is placed on the regression coefficients, and a new prior, based on the eigenvalue decomposition, is placed on \(\Sigma\). Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving the \(l_{2,1}\)-norm, which applies the \(l_2\)-norm to the coefficients within each group and the \(l_1\)-norm across the groups.
To obtain asymptotic properties of estimation and selection, the following conditions are imposed: \(p=\sum_{j=1}^G p_j\) may grow at a rate faster than \(n\), but the total number of coefficients in all non-zero groups together is of smaller order than \(n\); \(G\ge n^c\) for some positive constant \(c\), and \(\log G\) grows slower than \(n\); finally, \(d^2\log n\) grows at a rate slower than \(n\).
Under these conditions, the posterior contraction rate is obtained, along with bounds on the effective dimension of the model that hold with high posterior probability. It is shown that the regression coefficients can be recovered under certain compatibility conditions. The uncertainty in the coefficients is quantified with frequentist validity through a Bernstein-von Mises type theorem, and selection consistency of the presented Bayesian method is proven. An MCMC algorithm is outlined to compute the posterior distribution.
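The data-generating mechanism of the model above can be sketched as follows. The dimensions \(n\), \(d\), \(G\), the group sizes \(p_j\), the number of non-zero groups, and the particular \(\Sigma\) are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper).
n, d, G = 100, 3, 20
p_j = rng.integers(2, 5, size=G)               # group sizes p_1, ..., p_G
active = rng.choice(G, size=3, replace=False)  # non-zero groups (group sparsity)

# Group-structured coefficients: beta_j is p_j x d, zero outside active groups.
beta = [np.zeros((p, d)) for p in p_j]
for j in active:
    beta[j] = rng.normal(size=(p_j[j], d))

# Unknown error covariance Sigma (a random SPD matrix for illustration).
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)

# Y_i = sum_j X_ij beta_j + eps_i, with eps_i i.i.d. N(0, Sigma).
X = [rng.normal(size=(n, p)) for p in p_j]
Y = sum(X[j] @ beta[j] for j in range(G))
Y = Y + rng.multivariate_normal(np.zeros(d), Sigma, size=n)

# The l_{2,1}-norm used in the slab density: l_2 within each group
# (Frobenius norm of each beta_j), summed l_1-style across groups.
l21 = sum(np.linalg.norm(beta[j]) for j in range(G))
```

The sketch only simulates from the likelihood; fitting the spike-and-slab posterior would require the MCMC scheme outlined in the paper.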
Bayesian variable selection
covariance matrix
group sparsity
multivariate linear regression
posterior contraction rate
Rényi divergence
spike-and-slab prior