The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression (Q2176606)
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression | scientific article | |
Statements
The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression (English)
5 May 2020
The paper considers a high-dimensional logistic regression model in which the binary response \(y\in\{-1,1\}\) is linked to the covariates \(x\in\mathbb{R}^p\) as follows: \[ \mathbb{P}(y=1|x)=\sigma(\beta_0+x^\top\beta)=1-\mathbb{P}(y=-1|x), \quad \sigma(t):=\frac{e^t}{1+e^t}. \] Here \(\beta_0\in\mathbb{R}\) and \(\beta\in\mathbb{R}^p\) are unknown regression parameters, and \(x\) has a centered multivariate normal distribution with a nonsingular covariance matrix. The data consist of \(n\) independent pairs \((x_i, y_i)\), \(i=1, \dots, n\), drawn from this model. A sequence of problems is studied in which \(n\to\infty\), \(p/n \to \kappa\) (assumed to be less than 1), \(\beta_0\) is fixed, and \(\operatorname{Var}(x^\top\beta)\to\gamma_0^2\). This scaling keeps the log-odds \(\beta_0+x^\top\beta\) from growing with \(n\) or \(p\), so that the success probability \(\sigma(\beta_0+x^\top\beta)\) is not trivially equal to either 0 or 1. The existence of the maximum likelihood estimate (MLE) in this high-dimensional model is studied. An explicit function \(h_{\mathrm{MLE}}= h_{\mathrm{MLE}}(|\beta_0|, \gamma_0)\) is constructed such that \(\mathbb{P}(\text{MLE exists})\to 0\) for all \(\kappa> h_{\mathrm{MLE}}\) and \(\mathbb{P}(\text{MLE exists})\to 1\) for all \(\kappa< h_{\mathrm{MLE}}\). This profound statement about a sharp phase transition for the existence of the MLE is derived using ideas from convex geometry. In the case \(\beta_0=\gamma_0=0\), this statement implies the fundamental result of \textit{T. M. Cover} [IEEE Trans. Electron. Comput. 14, 326--334 (1965; Zbl 0152.18206)] concerning the separating capacities of decision surfaces, applied to logistic regression.
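The special case \(\beta_0=\gamma_0=0\) can be illustrated numerically with Cover's counting formula. The sketch below is not taken from the reviewed paper and rests on several assumptions: the labels are treated as independent fair \(\pm 1\) coin flips, the points are in general position (which holds almost surely for Gaussian covariates), separation is by a hyperplane through the origin (the intercept is ignored, which should not affect the limit), and the MLE is taken to fail exactly when the two classes are linearly separable. Under these assumptions, \(2\sum_{k=0}^{d-1}\binom{n-1}{k}\) of the \(2^n\) labelings of \(n\) points in \(\mathbb{R}^d\) are separable. The function name `cover_separable_probability` and the choice \(n=2000\) are illustrative only.

```python
from math import comb


def cover_separable_probability(n: int, d: int) -> float:
    """Probability that n points in general position in R^d, labelled by
    independent fair +/-1 coin flips, can be separated by a hyperplane
    through the origin (Cover's 1965 counting formula)."""
    # Number of separable labelings: 2 * sum_{k=0}^{d-1} binom(n-1, k).
    separable = 2 * sum(comb(n - 1, k) for k in range(d))
    # There are 2^n labelings in total; Python handles the big integers exactly.
    return separable / 2 ** n


if __name__ == "__main__":
    n = 2000  # illustrative sample size (an assumption, not from the paper)
    for kappa in (0.40, 0.45, 0.50, 0.55, 0.60):
        d = round(kappa * n)
        p_sep = cover_separable_probability(n, d)
        # Under the stated assumptions, the MLE exists iff the data are NOT separable.
        print(f"kappa={kappa:.2f}  P(separable)={p_sep:.6f}  P(MLE exists)={1 - p_sep:.6f}")
```

At \(d/n = 1/2\) the formula gives a separation probability of exactly \(1/2\). For ratios on either side of \(1/2\) it moves rapidly toward 0 or 1 as \(n\) grows, which is consistent with a phase transition at \(\kappa = 1/2\) in this special case.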
high-dimensional logistic regression
maximum likelihood estimate (MLE) phase transition
decision surface
multivariate centered normal distribution