A measure of relative entropy between individual sequences with application to universal classification

Authors:
J. Ziv; N. Merhav
Affiliations:
Dept. of Electr. Eng., Technion-Israel Inst. of Technol., Haifa
Venue:
IEEE Transactions on Information Theory
Year:
1993

Cited by (2):

Distance measures for biological sequences: Some recent approaches
International Journal of Approximate Reasoning

Measuring structural similarity of semistructured data based on information-theoretic approaches

The VLDB Journal — The International Journal on Very Large Data Bases

Abstract

A new notion of empirical informational divergence (relative entropy) between two individual sequences is introduced. If the two sequences are independent realizations of two finite-order, finite-alphabet, stationary Markov processes, the empirical relative entropy converges to the relative entropy almost surely. This empirical divergence is based on a version of the Lempel-Ziv data compression algorithm. A simple universal algorithm for classifying individual sequences into a finite number of classes, which is based on the empirical divergence, is introduced. The algorithm discriminates between the classes whenever they are distinguishable by some finite-memory classifier for almost every given training set and almost any test sequence from these classes. It is universal in the sense that it is independent of the unknown sources.
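
The abstract gives no formulas, so the following is only a minimal sketch of how such a Lempel-Ziv-based divergence and classifier might look. It assumes the commonly cited form of the estimator, (c(z|x) log n - c(z) log c(z)) / n, where c(z|x) is the number of phrases obtained by parsing z into longest substrings that occur in x, and c(z) is the number of phrases in the incremental self-parsing of z. The function names, the use of base-2 logarithms, and the one-training-sequence-per-class setup are illustrative assumptions, not details taken from this record.

```python
import math


def lz78_phrase_count(z):
    """Count phrases in the incremental (LZ78-style) self-parsing of z."""
    seen = set()
    phrase = ""
    count = 0
    for ch in z:
        phrase += ch
        if phrase not in seen:  # shortest phrase not seen before
            seen.add(phrase)
            count += 1
            phrase = ""
    if phrase:  # a trailing, already-seen phrase still counts once
        count += 1
    return count


def cross_parse_count(z, x):
    """Count phrases when z is parsed into longest substrings found in x."""
    i, count, n = 0, 0, len(z)
    while i < n:
        length = 1
        while i + length <= n and z[i:i + length] in x:
            length += 1
        # length - 1 is the longest match; a symbol absent from x
        # is treated as a phrase of length one (an assumption).
        i += max(length - 1, 1)
        count += 1
    return count


def empirical_divergence(z, x):
    """Assumed estimator: (c(z|x) log n - c(z) log c(z)) / n, in bits."""
    n = len(z)
    if n == 0:
        raise ValueError("test sequence must be non-empty")
    c_cross = cross_parse_count(z, x)
    c_self = lz78_phrase_count(z)
    return (c_cross * math.log2(n) - c_self * math.log2(c_self)) / n


def classify(test, training):
    """Pick the class whose training sequence minimizes the divergence."""
    return min(training, key=lambda label: empirical_divergence(test, training[label]))


if __name__ == "__main__":
    training = {"A": "ab" * 200, "B": "aabb" * 100}  # hypothetical toy data
    print(classify("ab" * 50, training))
```

In this toy setup the test sequence is a substring of the class A training sequence, so its cross-parsing yields a single phrase, while against class B it breaks into many short phrases. For sequences this short the estimate can be negative; the convergence stated in the abstract is an asymptotic result.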