Ensemble Estimators for Multivariate Entropy Estimation

Abstract: The problem of estimation of density functionals like entropy and mutual information has received much attention in the statistics and information theory communities. A large class of estimators of functionals of the probability density suffer from the curse of dimensionality, wherein the mean squared error (MSE) decays increasingly slowly as a function of the sample size $T$ as the dimension $d$ of the samples increases. In particular, the rate is often glacially slow of order $O(T^{-\gamma/d})$, where $\gamma > 0$ is a rate parameter. Examples of such estimators include kernel density estimators, $k$-nearest neighbor ($k$-NN) density estimators, $k$-NN entropy estimators, intrinsic dimension estimators and other examples. In this paper, we propose a weighted affine combination of an ensemble of such estimators, where optimal weights can be chosen such that the weighted estimator converges at a much faster dimension invariant rate of $O(T^{-1})$. Furthermore, we show that these optimal weights can be determined by solving a convex optimization problem which can be performed offline and does not require training data. We illustrate the superior performance of our weighted estimator for two important applications: (i) estimating the Panter-Dite distortion-rate factor and (ii) estimating the Shannon entropy for testing the probability distribution of a random sample.
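The weighting idea can be illustrated with a minimal sketch. The abstract does not give the form of the optimization problem, so the following rests on assumptions: each base estimator is indexed by a parameter `l`, its leading bias terms are proportional to `l**(i/d)` for `i = 1, ..., d-1`, and the weights are the minimum 2-norm solution of the affine constraint (weights sum to one) together with the bias-cancelling constraints. The names `ensemble_weights` and `combine` are hypothetical. Because the constraints depend only on the chosen parameters and the dimension, the weights can be computed without any data.

```python
import numpy as np

def ensemble_weights(params, d):
    """Minimum-norm affine weights that cancel an ASSUMED set of bias terms.

    params : 1-D sequence of distinct positive estimator parameters l
             (hypothetical, e.g. scaled neighborhood sizes)
    d      : dimension of the samples; requires len(params) >= d
    """
    params = np.asarray(params, dtype=float)
    # Row 0 enforces the affine constraint sum_l w_l = 1.
    rows = [np.ones_like(params)]
    # Rows 1..d-1 enforce sum_l w_l * l**(i/d) = 0 (assumed bias basis).
    for i in range(1, d):
        rows.append(params ** (i / d))
    A = np.vstack(rows)
    b = np.zeros(d)
    b[0] = 1.0
    # For an underdetermined consistent system, lstsq returns the
    # minimum 2-norm solution, i.e. the solution of the convex problem
    #   minimize ||w||_2  subject to  A w = b.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w

def combine(estimates, weights):
    """Weighted affine combination of the base estimates."""
    return float(np.dot(weights, estimates))

# Synthetic check: base estimates equal a true value plus assumed bias terms.
l_values = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
d = 3
w = ensemble_weights(l_values, d)
true_value = 2.0
base = true_value + 0.7 * l_values ** (1 / d) - 0.3 * l_values ** (2 / d)
combined = combine(base, w)  # bias terms cancel up to rounding error
```

In this synthetic setting the combined estimate recovers `true_value` up to numerical error, because the assumed bias terms lie exactly in the span that the constraints cancel. With real estimators the cancellation would only remove the modeled terms, and the constraint matrix can become ill-conditioned as `d` grows.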

