An Information-Theoretic Definition of Similarity
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Verbs semantics and lexical selection
ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Algorithmic detection of semantic similarity
WWW '05 Proceedings of the 14th international conference on World Wide Web
Using information content to evaluate semantic similarity in a taxonomy
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
A supervised machine learning approach for duplicate detection over gazetteer records
GeoS'11 Proceedings of the 4th international conference on GeoSpatial semantics
Hi-index | 0.00 |
Several authors have suggested similarity measures for objects labeled with terms from a hierarchical taxonomy We generalize this idea with a definition of information-theoretic similarity for taxonomies that are structured as directed acyclic graphs from which multiple terms may be used to describe an object We discuss how our definition should be adapted in the presence of ambiguity, and introduce new similarity measures based on our definitions. We present an implementation of our measures that is integrated with a relational database and scales to large taxonomies and datasets We evaluate our measures by applying them to an object-matching problem from bioinformatics, and show that, for this task, our new measures outperform those reported in the literature We also verified the scalability of our approach by applying it to patent similarity search, using patents classified with terms from the taxonomy defined by the United States Patent and Trademark Office.