Estimating entropy on m bins given fewer than m samples

  • Authors: L. Paninski

  • Affiliation: University College London, UK

  • Venue: IEEE Transactions on Information Theory

  • Year: 2006

Abstract

Consider a sequence p_N of discrete probability measures, supported on m_N points, and assume that we observe N independent and identically distributed (i.i.d.) samples from each p_N. We demonstrate the existence of an estimator of the entropy, H(p_N), which is consistent even if the ratio N/m_N is bounded (and, as a corollary, even if this ratio tends to zero, albeit at a sufficiently slow rate).
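
To build intuition for the undersampled regime the abstract describes (N comparable to, or smaller than, the number of bins m_N), the following Python sketch compares two standard baseline estimators, the naive plug-in (maximum-likelihood) estimator and the Miller-Madow bias correction, on a uniform distribution with more bins than samples. These baselines are not the estimator constructed in the paper; the example only illustrates how severely the plug-in estimate is biased downward when N/m is small, which is the failure mode the paper's estimator is designed to overcome.

    # Hypothetical illustration of entropy estimation with fewer samples than bins.
    # The estimators below are standard baselines, NOT the estimator from the paper.
    import numpy as np

    def plugin_entropy(counts):
        """Naive plug-in estimate: -sum p_hat * log(p_hat) over occupied bins."""
        n = counts.sum()
        p = counts[counts > 0] / n
        return -np.sum(p * np.log(p))

    def miller_madow_entropy(counts):
        """Plug-in estimate plus the first-order (m_occupied - 1)/(2N) bias correction."""
        n = counts.sum()
        m_occupied = np.count_nonzero(counts)
        return plugin_entropy(counts) + (m_occupied - 1) / (2.0 * n)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        m, n_samples = 1000, 500              # fewer samples than bins: N/m = 0.5
        p = np.full(m, 1.0 / m)               # uniform distribution, true entropy = log(m)
        counts = rng.multinomial(n_samples, p)
        print("true entropy    :", np.log(m))
        print("plug-in estimate:", plugin_entropy(counts))
        print("Miller-Madow    :", miller_madow_entropy(counts))

Running this sketch, both baselines substantially underestimate the true entropy log(m), since most bins are never observed; the paper's contribution is a consistent estimator in exactly this regime, where N/m_N stays bounded.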