Maximum likelihood estimation with partially censored data (Q1896243)

scientific article; zbMATH DE number 788291

      Statements

      Maximum likelihood estimation with partially censored data (English)
      13 February 1996
      If one observes a sample of independent, identically distributed random elements \(Z_1, \dots, Z_n\) from a completely unknown probability distribution \(\eta\), then the usual estimator for \(\eta\) is the empirical distribution \(\widehat{\eta} = n^{-1} \sum_{j=1}^n \delta_{Z_j}\).

      Consider the situation in which the observed \(Z_1, \dots, Z_n\) are actually part of a larger number \(m+n\) of replications of some experiment. Unfortunately, in \(m\) of the \(m+n\) replications the \(Z\)-value is not observed; instead one gets to see \(X\), which conditionally on \(Z=z\) has a known density \(p(x \mid z)\) with respect to a fixed measure \(\mu\). Hence the total set of observations is \(X_1, \dots, X_m\), \(Z_1, \dots, Z_n\); all observations are independent and their joint distribution can formally be written as \[ \prod_{i=1}^m \int p(x_i \mid y) \, d\eta(y) \prod_{j=1}^n d\eta(z_j). \] (The first factor in the product is a density with respect to \(\mu^m\); the second factor is just formal notation.)

      In this situation the set \(Z_1, \dots, Z_n\) clearly contains much more information about \(\eta\) than the set \(X_1, \dots, X_m\). Nevertheless, one would certainly want to take the information available in \(X_1, \dots, X_m\) into account and obtain an estimator for \(\eta\) that improves on \(\widehat{\eta}\), the empirical distribution of the second sample. Surprisingly, there may be a considerable gain in using \(X_1, \dots, X_m\) even in situations where the information (in the technical sense of semiparametric theory) in \(X_1, \dots, X_m\) alone is zero and \(\sqrt{n}\)-consistent estimators based on the first sample alone do not exist. For \(m=n\), use of the additional sample always becomes visible as a reduction in the asymptotic variance of the estimator. It is thus of interest to study estimators for \(\eta\) based on the whole set of observations.

      We show under some smoothness conditions that the maximum likelihood estimator for \(\eta\) attains a \(\sqrt{n}\)-rate and is asymptotically efficient for estimating \(\eta\) in the semiparametric sense.
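To make the likelihood above concrete, the following is a minimal sketch (not the construction analysed in the article) of how one might maximise it numerically. It rests on several assumptions that are not part of the review: the candidate \(\eta\) is restricted to the observed \(Z\)-values as support points, the maximisation uses a standard EM iteration for mixing weights, and the kernel \(p(x \mid z)\) is taken to be a unit-variance normal density centred at \(z\) (a deconvolution setting). The function names `gaussian_kernel`, `log_likelihood` and `npmle_weights` are hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y):
    """Assumed kernel p(x | y): N(y, 1) density. Purely illustrative."""
    return np.exp(-0.5 * (x - y) ** 2) / np.sqrt(2.0 * np.pi)


def log_likelihood(w, K, counts):
    """Log of prod_i sum_k w_k p(x_i | y_k) * prod_k w_k^{counts_k}."""
    return np.sum(np.log(K @ w)) + np.sum(counts * np.log(w))


def npmle_weights(x, z, kernel, n_iter=500):
    """EM iteration for the weights of eta on the observed Z-values.

    Assumption: eta is supported on the distinct values of z only.
    """
    x = np.asarray(x, dtype=float)
    support, counts = np.unique(np.asarray(z, dtype=float), return_counts=True)
    m, n = x.size, counts.sum()
    # K[i, k] = p(x_i | y_k) for every indirect observation and support point
    K = kernel(x[:, None], support[None, :])
    # Start from the empirical distribution of the Z-sample
    w = counts / n
    for _ in range(n_iter):
        mix = K @ w                      # mixture density at each x_i
        post = K * w / mix[:, None]      # P(Z_i = y_k | x_i) under current w
        # Expected counts from the X-sample plus observed counts from the Z-sample
        w = (post.sum(axis=0) + counts) / (m + n)
    return support, w
```

Starting the iteration at the empirical distribution of \(Z_1, \dots, Z_n\) means that every EM step can only increase the joint likelihood relative to \(\widehat{\eta}\), which mirrors the idea that the \(X\)-sample refines the empirical estimator rather than replacing it.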
      asymptotic normality
      mixture model
      deconvolution
      censoring
      asymptotic efficiency
      Donsker classes
      smooth and nonsmooth kernels
      empirical distribution
      maximum likelihood estimator
      semiparametric

      Identifiers