Progress in estimation of mutual information for real-valued data
ORAL
Abstract
Estimation of mutual information between (multidimensional) real-valued variables is used in the analysis of complex systems, biological systems, and recently also quantum systems. The estimation is a hard problem, and universally good estimators provably do not exist. Kraskov et al. (Phys. Rev. E, 2004) introduced a successful mutual information estimation approach based on the statistics of distances between neighboring data points, which empirically works well for a wide class of underlying probability distributions. Here we improve their estimator in several ways. First, we use the reparameterization invariance of mutual information to extend the estimator so that it works better for long-tailed and heavily skewed distributions. Second, we use subsampling techniques (more traditional resampling methods, such as the bootstrap, produce biased results for mutual information estimation) to estimate the variance and the bias of the resulting estimator. We demonstrate the performance of our estimator on synthetic data sets, as well as on neuroscience and systems biology datasets.
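As background for the abstract, the Kraskov et al. nearest-neighbor estimator (their "algorithm 1") can be sketched as follows. The `rank_transform` helper illustrates the reparameterization-invariance idea mentioned above (mutual information is unchanged by monotone transforms of each marginal, so mapping heavy-tailed marginals to ranks can stabilize the estimate); it is an illustrative choice, not necessarily the authors' exact construction, and the function names here are our own.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=4):
    """Kraskov-Stoegbauer-Grassberger MI estimate (algorithm 1), in nats.

    For each point, find the Chebyshev distance eps to its k-th neighbor
    in the joint (x, y) space, count marginal neighbors strictly within
    eps, and average the digamma terms:
        I = psi(k) + psi(N) - <psi(n_x + 1) + psi(n_y + 1)>.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    xy = np.hstack([x, y])
    # Distance to the k-th nearest neighbor in the joint space (max norm);
    # k + 1 because the query point is its own nearest neighbor.
    eps = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    # Count marginal neighbors strictly closer than eps (small shrink of the
    # radius approximates the strict inequality); subtract 1 for the point itself.
    nx = cKDTree(x).query_ball_point(x, eps - 1e-12, p=np.inf, return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, eps - 1e-12, p=np.inf, return_length=True) - 1
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

def rank_transform(z):
    """Map each marginal to (approximately) uniform via ranks.

    MI is invariant under monotone reparameterizations of each variable,
    so this preprocessing can tame long-tailed or heavily skewed marginals
    before applying the neighbor-based estimator.
    """
    z = np.asarray(z, dtype=float).reshape(len(z), -1)
    return np.argsort(np.argsort(z, axis=0), axis=0) / float(len(z))
```

For jointly Gaussian variables with correlation rho, the true value is -0.5 * ln(1 - rho^2) nats, which gives a convenient synthetic check of the estimator.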
Presenters
-
Ilya Nemenman
Emory University, Department of Physics and Department of Biology
Authors
-
Caroline Holmes
Princeton University
-
Ilya Nemenman
Emory University, Department of Physics and Department of Biology