The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression (Q2176606)

From MaRDI portal
scientific article

    Statements

    The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression (English)
    5 May 2020
    A high-dimensional logistic regression model is considered in which the binary response \(y\in\{-1,1\}\) is linked to the covariates \(x\in\mathbb{R}^p\) by \[ \mathbb{P}(y=1\mid x)=\sigma(\beta_0+x^\top\beta)=1-\mathbb{P}(y=-1\mid x), \quad \sigma(t):=\frac{e^t}{1+e^t}. \] Here \(\beta_0\in\mathbb{R}\) and \(\beta\in\mathbb{R}^p\) are unknown regression parameters, and \(x\) has a centered multivariate normal distribution with a nonsingular covariance matrix. The observations are \(n\) independent pairs \((x_i, y_i)\), \(i=1, \dots, n\), drawn from this model. A sequence of problems is studied in which \(n\to\infty\), \(p/n \to \kappa\) (assumed to be less than 1), \(\beta_0\) is fixed and \(\operatorname{Var}(x^\top\beta)\to\gamma_0^2\). This scaling is chosen so that the log-odds \(\beta_0+x^\top\beta\) do not grow with \(n\) or \(p\), and hence the likelihood is not trivially equal to either 0 or 1. The existence of the maximum likelihood estimate (MLE) in this high-dimensional model is studied. An explicit function \(h_{\mathrm{MLE}}= h_{\mathrm{MLE}}(|\beta_0|, \gamma_0)\) is constructed such that \(\mathbb{P}(\text{MLE exists})\to 0\) for all \(\kappa> h_{\mathrm{MLE}}\) and \(\mathbb{P}(\text{MLE exists})\to 1\) for all \(\kappa< h_{\mathrm{MLE}}\). This profound statement about a sharp phase transition for the existence of the MLE is derived using ideas from convex geometry. In the case \(\beta_0=\gamma_0=0\), the statement implies the fundamental result of T. M. Cover [IEEE Trans. Electron. Comput. 14, 326--334 (1965; Zbl 0152.18206)] on the separating capacities of decision surfaces, applied to logistic regression.
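    The phase transition described in the review can be probed numerically. The following is a minimal sketch, not the authors' method: it uses the standard fact that the logistic MLE exists exactly when the data are not linearly separable (completely or quasi-completely), and tests separability with a linear program. The function names `mle_exists` and `existence_fraction`, the tolerance, the sample sizes and the identity covariance are illustrative assumptions. Only the null case \(\beta_0=\gamma_0=0\) is simulated, where the threshold reduces to Cover's classical value \(\kappa = 1/2\).

```python
import numpy as np
from scipy.optimize import linprog


def mle_exists(X, y, tol=1e-6):
    """Return True if the logistic MLE (with intercept) exists for labels y in {-1, +1}.

    The MLE fails to exist iff some nonzero (b0, b) satisfies
    y_i * (b0 + x_i^T b) >= 0 for all i with at least one strict inequality.
    We detect this by maximizing the sum of the margins over a box.
    """
    n, p = X.shape
    # Row i of Z is y_i * (1, x_i), so (Z @ b)_i is the signed margin of point i.
    Z = y[:, None] * np.hstack([np.ones((n, 1)), X])
    res = linprog(
        c=-Z.sum(axis=0),            # linprog minimizes, so negate the objective
        A_ub=-Z,                     # -Z b <= 0  is the same as  Z b >= 0
        b_ub=np.zeros(n),
        bounds=[(-1.0, 1.0)] * (p + 1),  # box keeps the LP bounded
        method="highs",
    )
    if res.status != 0:
        raise RuntimeError(f"LP solver failed: {res.message}")
    # Optimum 0 means only b = 0 is admissible: no separation, so the MLE exists.
    return -res.fun <= tol


def existence_fraction(kappa, n=200, trials=5, seed=0):
    """Fraction of simulated null-model data sets (beta_0 = 0, beta = 0) with an MLE."""
    rng = np.random.default_rng(seed)
    p = int(round(kappa * n))
    hits = 0
    for _ in range(trials):
        X = rng.standard_normal((n, p))        # assumed identity covariance
        y = rng.choice([-1.0, 1.0], size=n)    # labels independent of X
        hits += mle_exists(X, y)
    return hits / trials


if __name__ == "__main__":
    for kappa in (0.3, 0.7):
        print(kappa, existence_fraction(kappa))
```

    Under these assumptions, the fraction of data sets for which the MLE exists should be close to 1 well below \(\kappa = 1/2\) and close to 0 well above it; values of \(\kappa\) near the threshold need larger \(n\) to show a sharp transition.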
    high-dimensional logistic regression
    maximum likelihood estimate (MLE) phase transition
    decision surface
    multivariate centered normal distribution
