Hyper nonlocal priors for variable selection in generalized linear models (Q1987723)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Hyper nonlocal priors for variable selection in generalized linear models | scientific article | |
Statements
Hyper nonlocal priors for variable selection in generalized linear models (English)
15 April 2020
This paper proposes a new Bayesian method of variable selection in generalized linear models, an important issue in any scientific discipline that uses such models to analyse experimental data. The following is a summary of the technical aspects of the paper. Let \(Y=(Y_1,\dots,Y_{n})'\) be an \(n\)-dimensional random vector (the response variable). Let \[ X=[x_{\cdot,1},\dots,x_{\cdot,p}] \] be a numerical \(n\times p\) matrix, the design matrix, where \(n\) is the number of observations and \(p\) is the total number of variables to be considered. A vector \(j=(j_1,\dots,j_{p})\in\{0,1\}^{p}\) indicates the model \[ h(E(Y))=\beta_{0,j}+X_{j} \cdot \beta_{j}, \] where \(h\) is a given link function, \(\beta_{0,j}\) is the intercept, \(X_{j}\) is the design matrix consisting only of the columns \(x_{\cdot,k}\) with \(j_{k}=1\), \(k=1,\dots,p\), and \(\beta_{j}\) is the corresponding regression vector of nonzero coefficients. Its dimension \(p_{j}\) is the number of ones in the vector \(j\), so \(X_{j}\) is an \(n\times p_{j}\) matrix. Let \(M=\{j\mid j=(j_1,\dots,j_{p})\in\{0,1\}^{p}\}\) be the set of candidate models. For each \(j\in M\), \(\beta_{0,j}\) and \(\beta_{j}\) are unknown parameters to be estimated from observations of the response variable, and the random vector \(Y\) is assumed to have a density given for each \(y=(y_1,\dots,y_{n})'\in\mathbb{R}^n\) by \[ f(y)=\prod_{i=1}^n\exp(((y_{i}\theta(\eta_{i,j})-b(\theta(\eta_{i,j})))/\varphi)+c(y_{i},\varphi)), \] where \(\theta\), \(b\), \(c\) are known functions, \[ \eta_{i,j}=h(E(Y_{i}))=\beta_{0,j}+X_{i,j} \cdot \beta_{j}, \] \(X_{i,j}\) denotes the \(i\)-th row of \(X_{j}\), \(\eta_{j}=(\eta_{1,j},\dots,\eta_{n,j})'\) is the linear predictor vector, \(\theta(\eta_{i,j})\) is the canonical parameter for \(Y_{i}\), and \(\varphi\) is the dispersion parameter. Furthermore, \(E(Y_{i})=Db(\theta(\eta_{i,j}))\) and \(\mathrm{Var}(Y_{i})=\varphi D^2b(\theta(\eta_{i,j}))\), where \(D\) is the derivative operator, and all higher-order derivatives of \(b\) exist.
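The notation above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the authors' code: the Poisson family with log link is chosen as the example (so \(\theta(\eta)=\eta\), \(b(\theta)=e^{\theta}\), \(\varphi=1\), \(c(y,\varphi)=-\log y!\)), the toy data and coefficient values are invented, and the helper names `select_columns`, `linear_predictor`, `poisson_log_density` and `derivative` are hypothetical.

```python
import itertools
import math


def select_columns(X, j):
    """Return X_j: the columns k of X for which j[k] == 1."""
    keep = [k for k, flag in enumerate(j) if flag == 1]
    return [[row[k] for k in keep] for row in X]


def linear_predictor(X_j, beta0, beta):
    """eta_i = beta0 + X_{i,j} . beta_j for every observation i."""
    return [beta0 + sum(x * b for x, b in zip(row, beta)) for row in X_j]


def poisson_log_density(y, eta):
    """log f(y) for the assumed Poisson case:
    theta(eta) = eta, b(theta) = exp(theta), phi = 1, c(y, phi) = -log(y!)."""
    return sum(yi * ei - math.exp(ei) - math.lgamma(yi + 1)
               for yi, ei in zip(y, eta))


def derivative(f, t, h=1e-5):
    """Central finite-difference approximation of Df(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


# Invented toy data: n = 4 observations, p = 3 candidate variables.
X = [[1.0, 0.5, -1.0],
     [0.0, 1.5, 2.0],
     [2.0, -0.5, 0.0],
     [1.0, 1.0, 1.0]]
y = [1, 0, 3, 2]
p = len(X[0])

# M = {0,1}^p contains 2**p candidate models.
models = list(itertools.product([0, 1], repeat=p))

# The model j = (1, 0, 1) keeps the first and third columns, so p_j = 2.
j = (1, 0, 1)
X_j = select_columns(X, j)
eta = linear_predictor(X_j, beta0=0.1, beta=[0.3, -0.2])
loglik = poisson_log_density(y, eta)

# E(Y_1) = Db(theta(eta_1)); for the Poisson case this equals exp(eta_1).
mean_first = derivative(math.exp, eta[0])

print(len(models), X_j, loglik, mean_first)
```

Enumerating all of \(M\) as done here is only feasible for small \(p\), since the number of models grows as \(2^{p}\).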
The paper addresses the problem of choosing the model in \(M\) that best predicts the response vector. Given a method of variable selection, the following hypothesis test is performed for each \(j\in M\): \(H_0\) (null hypothesis): the best model is the one indicated by \(j\), versus \(H_1\) (alternative hypothesis): the model indicated by \(j\) is not the best model in \(M\). The authors propose four new Bayesian methods of variable selection by assigning prior distributions to \(\beta_{0,j}\) and \(\beta_{j}\), assuming that the dispersion parameter \(\varphi\) is known. All four methods belong to the class of nonlocal priors, a concept that is briefly explained here. In a two-hypothesis testing problem in parametric statistics, suppose that \(\Theta\) is the parameter space, \(\Theta_0\) the subspace corresponding to the null hypothesis and \(\Theta_1\) that of the alternative hypothesis. From a Bayesian point of view, the probability densities of the parameters must be specified under both hypotheses, say \(\pi_0\) for the null hypothesis and \(\pi_1\) for the alternative. Both distributions are said to be nonlocal if \(\pi_0(\theta)=0\) for \(\theta\in\Theta_1\) and \(\pi_1(\theta)=0\) for \(\theta\in\Theta_0\). The main results of the paper establish, under certain regularity conditions, the rates of convergence of the Bayes factors based on the proposed priors. The authors also show that these convergence rates are superior to those obtained with local priors, and they comment on what the main results mean in practice, which helps the reader to understand them. One section of the paper analyses the performance of the proposed methods in three simulation studies, of which only one is reported in detail. Another section presents the results of applying the proposed methods to an example with real data.
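To illustrate what it means for the alternative density to vanish on the null set, the following minimal sketch compares a local normal prior with a first-order moment-type density, \(\pi(\beta)=(\beta^{2}/\tau)\,N(\beta;0,\tau)\), a standard example from the nonlocal prior literature. This is an assumption made for illustration: the review does not state the exact form of the four priors proposed in the paper, and the function names and the scale value \(\tau=1\) are hypothetical.

```python
import math


def normal_pdf(beta, tau):
    """Local prior: density of N(0, tau) at beta (tau is the variance)."""
    return math.exp(-beta * beta / (2.0 * tau)) / math.sqrt(2.0 * math.pi * tau)


def moment_pdf(beta, tau):
    """Assumed nonlocal example: (beta^2 / tau) * N(beta; 0, tau).
    The factor beta^2 / tau forces the density to zero at beta = 0,
    and dividing by tau = E(beta^2) under N(0, tau) keeps it normalized."""
    return (beta * beta / tau) * normal_pdf(beta, tau)


tau = 1.0  # hypothetical scale parameter

# At the null value beta = 0 the local prior is positive, the nonlocal one is zero.
local_at_null = normal_pdf(0.0, tau)
nonlocal_at_null = moment_pdf(0.0, tau)

# Trapezoidal check that the moment-type density integrates to (about) one.
step = 0.005
grid = [-10.0 + step * i for i in range(int(20.0 / step) + 1)]
values = [moment_pdf(b, tau) for b in grid]
total_mass = step * (sum(values) - 0.5 * (values[0] + values[-1]))

print(local_at_null, nonlocal_at_null, total_mass)
```

Because such a density puts no mass near the null value \(\beta=0\), data generated under the null hypothesis are poorly explained by the alternative, which is the intuition behind the faster accumulation of evidence for a true null model.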
In my opinion, the authors should make the supplementary material to which they refer in the paper available free of charge. Although the subject of the paper is not a simple one in statistics, it can be read without much difficulty by someone who is not a professional statistician, particularly if the technical details of the proofs of the main theoretical results are skipped. A brief section of the paper clearly summarizes the proposed variable selection techniques as well as their advantages over other known techniques. The list of bibliographical references is extensive and up to date.
Bayesian variable selection
generalized linear model
nonlocal prior
scale mixtures
variable selection consistency