Optimal subsampling for softmax regression (Q2423180)
From MaRDI portal
scientific article
Language | Label | Description | Also known as
---|---|---|---
English | Optimal subsampling for softmax regression | scientific article |
Statements
Optimal subsampling for softmax regression (English)
21 June 2019
To draw inferences from extremely massive data sets under computing-time constraints, subsampling methods have been used profitably over the last two decades. While earlier studies concentrated on linear regression, \textit{H. Wang} et al. [J. Am. Stat. Assoc. 114, No. 525, 393--405 (2019; Zbl 1478.62196)] considered the logistic regression setup and, drawing a parallel with optimal experimental design, developed optimal subsampling methods. In the present paper, the authors consider the multinomial logistic (softmax) regression model for multi-class classification data. Optimal subsampling probabilities for this model are derived under A-optimality, which minimizes the average of the asymptotic variances of all parameter components, as well as under L-optimality, which minimizes the trace of the variance-covariance matrix of a linear transformation L of the parameter estimate. A two-stage adaptive algorithm is provided that makes the procedure transparent: a pilot estimate from a first-stage uniform subsample is used to compute the optimal subsampling probabilities for the second stage. A simulation study at the end evaluates the efficiency of the algorithm under various conditions, and an appendix gives detailed proofs of the two main theorems, on asymptotic normality and on the optimal subsampling probabilities.
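The two-stage adaptive procedure described in the review can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the particular form of the second-stage probabilities (proportional to \(\|x_i\|\,\|y_i - p_i(\hat\beta_0)\|\), an L-optimality-style surrogate), the plain gradient-ascent optimizer, and all function names are assumptions made for illustration only.

```python
import numpy as np

def softmax(scores):
    # row-wise softmax with max-subtraction for numerical stability
    s = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(s)
    return e / e.sum(axis=1, keepdims=True)

def fit_softmax(X, Y, w, lr=0.5, iters=500):
    # weighted softmax-regression MLE via plain gradient ascent
    # (illustrative optimizer; any weighted MLE routine would do)
    B = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        grad = X.T @ (w[:, None] * (Y - softmax(X @ B))) / w.sum()
        B += lr * grad
    return B

def two_stage_subsample(X, y, K, r0=300, r=1000, seed=0):
    # Stage 1: uniform pilot subsample -> pilot estimate B0.
    # Stage 2: subsample with optimality-motivated probabilities,
    # then fit with inverse-probability weights to remove bias.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Y = np.eye(K)[y]                         # one-hot responses
    idx0 = rng.choice(n, size=r0, replace=True)
    B0 = fit_softmax(X[idx0], Y[idx0], np.ones(r0))
    resid = Y - softmax(X @ B0)              # per-point residuals at pilot
    g = np.linalg.norm(X, axis=1) * np.linalg.norm(resid, axis=1)
    pi = g / g.sum()                         # assumed L-optimality-style form
    idx = rng.choice(n, size=r, replace=True, p=pi)
    w = 1.0 / (n * pi[idx])                  # inverse-probability weights
    return fit_softmax(X[idx], Y[idx], w)
```

The inverse-probability weights in the second-stage fit are what make the subsampled estimator consistent for the full-data MLE despite the non-uniform sampling.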
large data sets
subsampling
A-optimality criterion
L-optimality criterion
multinomial logistic regression model
Softmax regression
optimum experimental designs
covariance matrix