Maximum likelihood estimation with partially censored data (Q1896243)

From MaRDI portal
scientific article

    Statements

    Maximum likelihood estimation with partially censored data (English)
    13 February 1996
    If one observes a sample of independent, identically distributed random elements \(Z_1, \dots, Z_n\) from a completely unknown probability distribution \(\eta\), then the usual estimator for \(\eta\) is the empirical distribution \(\widehat{\eta} = n^{-1} \sum_{j=1}^n \delta_{Z_j}\). Consider the situation in which the observed \(Z_1, \dots, Z_n\) are actually part of a larger number \(m+n\) of replications of some experiment. Unfortunately, in \(m\) of the \(m+n\) replications the \(Z\)-value is not observed; instead one observes \(X\), which conditionally on \(Z=z\) has a known density \(p(x\mid z)\) with respect to a fixed measure \(\mu\). Hence the total set of observations is \(X_1, \dots, X_m\), \(Z_1, \dots, Z_n\); all observations are independent and their joint distribution can formally be written as \[ \prod_{i=1}^m \int p(x_i \mid y) \, d\eta(y) \prod_{j=1}^n d\eta(z_j). \] (The first factor in the product is a density with respect to \(\mu^m\); the second factor is just formal notation.)
    In this situation the set \(Z_1, \dots, Z_n\) clearly contains much more information about \(\eta\) than the set \(X_1, \dots, X_m\). Nevertheless, one would certainly want to take the information available in \(X_1, \dots, X_m\) into account and obtain an estimator for \(\eta\) that improves on \(\widehat{\eta}\), the empirical distribution of the second sample. Surprisingly, there may be a considerable gain in using \(X_1, \dots, X_m\) even in situations where the information (in the technical sense of semiparametric theory) in \(X_1, \dots, X_m\) alone is zero and \(\sqrt{n}\)-consistent estimators based on the first sample alone do not exist. For \(m=n\), use of the additional sample always becomes visible as a reduction in the asymptotic variance of the estimator. It is thus of interest to study estimators for \(\eta\) based on the whole set of observations.
    We show under some smoothness conditions that the maximum likelihood estimator for \(\eta\) attains a \(\sqrt{n}\)-rate and is asymptotically efficient for estimating \(\eta\) in the semiparametric sense.
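As a rough illustration of the estimator discussed in the review, the sketch below maximizes the joint likelihood of \(X_1, \dots, X_m\), \(Z_1, \dots, Z_n\) by EM iterations. It is not the construction analysed in the article: it assumes, purely for illustration, that \(\eta\) is supported on a known finite grid and that \(p(x \mid z)\) is a normal density with mean \(z\) and unit variance. All function names (`normal_pdf`, `em_mle`, `simulate`, `log_likelihood`) are hypothetical. Under these assumptions each EM step replaces the weight of a grid point by the observed count of \(Z\)-values at that point plus the posterior mass assigned to it by the \(X\)-values, divided by \(m+n\).

```python
import math
import random


def normal_pdf(x, z, sigma=1.0):
    """Assumed known conditional density p(x | z): Normal(z, sigma^2)."""
    u = (x - z) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(weights, grid, xs, z_counts, density=normal_pdf):
    """Log of prod_i [sum_k w_k p(x_i | g_k)] * prod_j eta({z_j}) for discrete eta."""
    ll = 0.0
    for x in xs:
        ll += math.log(sum(w * density(x, g) for w, g in zip(weights, grid)))
    for w, c in zip(weights, z_counts):
        if c > 0:  # grid points with no observed Z contribute nothing here
            ll += c * math.log(w)
    return ll


def em_mle(grid, xs, zs, n_iter=200, density=normal_pdf):
    """EM iterations for the weights of eta on a fixed finite grid (an assumption)."""
    # Exact float comparison is safe only because zs are drawn from the grid itself.
    z_counts = [sum(1 for z in zs if z == g) for g in grid]
    total = len(xs) + len(zs)
    weights = [1.0 / len(grid)] * len(grid)  # uniform starting point
    for _ in range(n_iter):
        # Fully observed Z-values contribute their counts directly.
        expected = [float(c) for c in z_counts]
        # E-step: each X_i contributes its posterior distribution over the grid.
        for x in xs:
            joint = [w * density(x, g) for w, g in zip(weights, grid)]
            norm = sum(joint)
            for k, v in enumerate(joint):
                expected[k] += v / norm
        # M-step: renormalize expected counts over all m + n replications.
        weights = [e / total for e in expected]
    return weights, z_counts


def simulate(grid, true_weights, m, n, seed=0):
    """Draw n fully observed Z-values and m indirect observations X | Z ~ N(Z, 1)."""
    rng = random.Random(seed)
    zs = rng.choices(grid, weights=true_weights, k=n)
    hidden = rng.choices(grid, weights=true_weights, k=m)
    xs = [rng.gauss(z, 1.0) for z in hidden]
    return xs, zs


if __name__ == "__main__":
    grid = [-2.0, 0.0, 2.0]
    xs, zs = simulate(grid, [0.2, 0.5, 0.3], m=500, n=500)
    mle, counts = em_mle(grid, xs, zs)
    empirical = [c / len(zs) for c in counts]
    print("empirical (Z only):", empirical)
    print("MLE (X and Z):     ", mle)
```

Comparing the two printed weight vectors over repeated simulations gives an informal feel for the variance reduction mentioned in the review; the finite grid sidesteps the infinite-dimensional aspects that the article actually addresses.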
    asymptotic normality
    mixture model
    deconvolution
    censoring
    asymptotic efficiency
    Donsker classes
    smooth and nonsmooth kernels
    empirical distribution
    maximum likelihood estimator
    semiparametric