Maximum likelihood estimation with partially censored data (Q1896243)

scientific article

Language	Label	Description	Also known as
English	Maximum likelihood estimation with partially censored data	scientific article

Statements

instance of

scholarly article

0 references

title

Maximum likelihood estimation with partially censored data (English)

0 references

published in

The Annals of Statistics

0 references

publication date

13 February 1996

0 references

review text

If one observes a sample of independent, identically distributed random elements \(Z_1, \dots, Z_n\) from a completely unknown probability distribution \(\eta\), then the usual estimator for \(\eta\) is the empirical distribution \(\widehat{\eta} = n^{-1} \sum_{j=1}^n \delta_{Z_j}\). Consider the situation in which the observed \(Z_1, \dots, Z_n\) are actually part of a larger number \(m+n\) of replications of some experiment. Unfortunately, in \(m\) of the \(m+n\) replications the \(Z\)-value is not observed; instead one observes \(X\), which conditionally on \(Z=z\) has a known density \(p(x \mid z)\) with respect to a fixed measure \(\mu\). Hence the total set of observations is \(X_1, \dots, X_m\), \(Z_1, \dots, Z_n\); all observations are independent and their joint distribution can formally be written as \[ \prod_{i=1}^m \int p(x_i \mid y) \, d\eta(y) \prod_{j=1}^n d\eta(z_j). \] (The first factor in the product is a density with respect to \(\mu^m\); the second factor is just formal notation.) In this situation the set \(Z_1, \dots, Z_n\) clearly contains much more information about \(\eta\) than the set \(X_1, \dots, X_m\). Nevertheless, one would certainly want to take the information available in \(X_1, \dots, X_m\) into account and obtain an estimator for \(\eta\) that improves on \(\widehat{\eta}\), the empirical distribution of the second sample. Surprisingly, there may be a considerable gain in using \(X_1, \dots, X_m\) even in situations where the information (in the technical sense of semiparametric theory) in \(X_1, \dots, X_m\) alone is zero and \(\sqrt{n}\)-consistent estimators based on the first sample alone do not exist. For \(m=n\), use of the additional sample always shows up as a reduction in the asymptotic variance of the estimator. It is thus of interest to study estimators for \(\eta\) based on the whole set of observations.
We show under some smoothness conditions that the maximum likelihood estimator for \(\eta\) attains a \(\sqrt {n}\)-rate and is asymptotically efficient for estimating \(\eta\) in the semiparametric sense.
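
To make the likelihood above concrete, the following is a minimal illustrative sketch, not the estimator analysed in the article. It assumes that \(\eta\) is restricted to weights \(w_j\) on the distinct observed points \(Z_1, \dots, Z_n\) (the full maximum likelihood estimator may also place mass elsewhere), and it assumes a Gaussian deconvolution setting \(X = Z + \text{noise}\) for the known density \(p(x \mid z)\). Under these assumptions the restricted likelihood \(\prod_i \sum_j w_j p(x_i \mid z_j) \prod_j w_j\) can be maximized by an EM iteration with update \(w_j \leftarrow (1 + \sum_i r_{ij})/(m+n)\), where \(r_{ij}\) is the posterior probability that \(X_i\) arose from \(z_j\). The names `normal_density` and `em_weights` are hypothetical.

```python
import math
import random


def normal_density(x, z, sigma=1.0):
    """Assumed example of the known density p(x | z): X = Z + N(0, sigma^2)."""
    u = (x - z) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def em_weights(xs, zs, density=normal_density, iters=200):
    """EM iteration for the weights of eta, restricted to the support zs.

    Assumption: the points in zs are distinct and eta places mass only there.
    """
    m, n = len(xs), len(zs)
    w = [1.0 / n] * n  # start from the empirical distribution of the Z-sample
    p = [[density(x, z) for z in zs] for x in xs]  # p[i][j] = p(x_i | z_j)
    for _ in range(iters):
        # Each directly observed z_j counts as one full observation.
        counts = [1.0] * n
        for i in range(m):
            denom = sum(w[j] * p[i][j] for j in range(n))
            if denom <= 0.0:
                continue  # numerical underflow: this x_i carries no usable weight
            for j in range(n):
                # E-step: posterior probability that x_i was generated from z_j.
                counts[j] += w[j] * p[i][j] / denom
        # M-step: normalize; the total equals m + n when no x_i was skipped.
        total = sum(counts)
        w = [c / total for c in counts]
    return w


def simulate(m, n, seed=0):
    """Draw an illustrative sample: Z ~ N(0, 2^2), X = Z + N(0, 1)."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, 2.0) for _ in range(n)]
    xs = [rng.gauss(0.0, 2.0) + rng.gauss(0.0, 1.0) for _ in range(m)]
    return xs, zs
```

For example, `xs, zs = simulate(50, 50)` followed by `em_weights(xs, zs)` returns one weight per observed \(Z_j\). With no \(X\)-observations the weights stay at \(1/n\), which is the empirical distribution \(\widehat{\eta}\); with \(X\)-observations, points \(z_j\) that explain many of the \(x_i\) receive more mass.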

0 references

zbMATH Keywords

asymptotic normality

0 references

mixture model

0 references

deconvolution

0 references

censoring

0 references

asymptotic efficiency

0 references