An introduction to Kolmogorov complexity and its applications (2nd ed.)
A compression algorithm for DNA sequences and its applications in genome comparison
RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
A mathematical theory of communication
ACM SIGMOBILE Mobile Computing and Communications Review
IEEE Transactions on Information Theory
Shared information and program plagiarism detection
IEEE Transactions on Information Theory
Towards a universal information distance for structured data
Proceedings of the Fourth International Conference on SImilarity Search and APplications
A multivariate correlation distance for vector spaces
SISAP'12 Proceedings of the 5th international conference on Similarity Search and Applications
We present a new metric for comparing unordered, tree-structured data. While such data is increasingly important in its own right, the methodology underlying the construction of the metric is generic and may be reused for other classes of ordered and partially ordered data. The metric is based on the information content of the two values under consideration, measured using Shannon's entropy equations. In essence, the more the two values have in common, the closer they are. Because values in this domain may have nothing in common, a good metric should be bounded, so that such pairs lie at a fixed maximum distance. This property has been achieved, but it is in tension with the triangle inequality.
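
To make the idea of a bounded, entropy-based distance concrete, here is a minimal Python sketch. It is not the metric constructed in the paper: it assumes each value has been flattened to a non-empty collection of labels, and it swaps in the standard Jensen-Shannon distance, which is computed from Shannon entropies and lies in [0, 1]. The names `shannon_entropy`, `normalise` and `js_distance` are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from math import log2, sqrt


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def normalise(items):
    """Turn a non-empty collection of labels into relative frequencies."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("each value must contain at least one label")
    return {label: n / total for label, n in counts.items()}


def js_distance(a, b):
    """Jensen-Shannon distance between two label collections.

    Illustrative stand-in only, not the paper's metric.
    Returns 0.0 for identical distributions and 1.0 when the two
    collections share no labels at all.
    """
    p, q = normalise(a), normalise(b)
    labels = set(p) | set(q)
    # Mixture distribution: the average of p and q over all labels.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in labels}
    divergence = shannon_entropy(m.values()) - 0.5 * (
        shannon_entropy(p.values()) + shannon_entropy(q.values())
    )
    # Clamp tiny negative rounding errors before taking the square root.
    return sqrt(max(divergence, 0.0))
```

Under these assumptions, two collections with the same label frequencies are at distance 0, two collections with no labels in common are at the upper bound of 1, and partially overlapping collections fall strictly in between. Flattening a tree to its labels discards its structure, so this sketch only illustrates the boundedness property, not the treatment of tree-structured data.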