Divergence-based estimation and testing of statistical models of classification (Q1898412)
scientific article
| Language | Label | Description | Also known as |
| --- | --- | --- | --- |
| English | Divergence-based estimation and testing of statistical models of classification | scientific article | |
Statements
Divergence-based estimation and testing of statistical models of classification (English)
1 September 1996
A frequent problem of categorical data analysis is that a fixed number \(n\) of samples \(X = (X_1, \dots, X_n) \in {\mathcal X}^n\) is taken from each of \(N\) different populations (families of individuals, clusters of objects). The sample space \(\mathcal X\) is classified into \(r\) categories by a rule \(\rho : {\mathcal X} \to \{1, \dots, r\}\). Let \(Y = (Y_1,\dots, Y_r)\) be the classification vector whose components count the respective categories in the sample vector \(X\); i.e., let \[ Y_j = \#\{1 \leq k \leq n: \rho(X_k) = j\},\quad 1 \leq j\leq r. \] The sample space of the vector \(Y\) is denoted by \(S_{n,r}\); i.e., \[ S_{n,r} = \{y = (y_1,\dots, y_r) \in \{0,1,\dots, n\}^r : y_1 + \cdots + y_r = n\}. \] Populations \(i = 1,\dots, N\) generate different sample vectors \(X^{(i)}\) and corresponding classification vectors \(Y^{(i)}\). The sampled populations are assumed to be independent and homogeneous in the sense that the \(X^{(i)}\), and consequently the \(Y^{(i)}\), are independent realizations of the vectors \(X\) and \(Y\) considered above. The i.i.d. property of the components \(X_1,\dots, X_n\) is included as a special case. The aim of this paper is to present an extended class of methods for estimating parameters of statistical models of the vectors \(Y\) and for testing statistical hypotheses about these models. The methods are based on the so-called \(\phi\)-divergences of probability distributions; they include as particular cases the well-known maximum likelihood method of estimation and Pearson's \(X^2\) method of testing. Asymptotic properties of estimators minimizing the \(\phi\)-divergence between theoretical and empirical vectors of means are established. Asymptotic distributions of \(\phi\)-divergences between empirical and estimated vectors of means are explicitly evaluated, and tests based on these statistics are studied.
phi divergence
classification
clustered data
minimum divergence estimation
minimum divergence testing
optimality of testing
categorical data analysis
maximum likelihood