Sample complexity of the distinct elements problem (Q1737973)

From MaRDI portal





scientific article
Language Label Description Also known as
default for all languages
No label defined
    English
    Sample complexity of the distinct elements problem
    scientific article

      Statements

      Sample complexity of the distinct elements problem (English)
      0 references
      0 references
      0 references
      24 April 2019
      0 references
      Summary: We consider the distinct elements problem, where the goal is to estimate the number of distinct colors in an urn containing \(k\) balls based on \(n\) samples drawn with replacements. Based on discrete polynomial approximation and interpolation, we propose an estimator with additive error guarantee that achieves the optimal sample complexity within \(O(\log\log k)\) factors, and in fact within constant factors for most cases. The estimator can be computed in \(O(n)\) time for an accurate estimation. The result also applies to sampling without replacement provided the sample size is a vanishing fraction of the urn size. One of the key auxiliary results is a sharp bound on the minimum singular values of a real rectangular Vandermonde matrix, which might be of independent interest.
      0 references
      sampling large population
      0 references
      nonparametric statistics
      0 references
      discrete polynomial approximation
      0 references
      orthogonal polynomials
      0 references
      Vandermonde matrix
      0 references
      minimaxity
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references

      Identifiers

      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references
      0 references