Recent work has demonstrated that the assessment of pairwise object similarity can be approached in an axiomatic manner using information theory. We extend this concept specifically to document similarity and test the effectiveness of an information-theoretic measure for pairwise document similarity. We adapt query retrieval to rate the quality of document similarity measures and demonstrate that our proposed information-theoretic measure for document similarity yields statistically significant improvements over other popular measures of similarity.
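The abstract does not state the measure itself, so the following is only a minimal sketch of what an information-theoretic pairwise document similarity might look like. It assumes a Lin-style ratio (information shared by the two documents divided by the total information in both), and it assumes that the information carried by a term is the log of the fraction of corpus documents containing that term. The function names `term_probs`, `doc_fraction`, and `it_similarity` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def term_probs(tokens):
    """Relative frequency of each term within one tokenized document."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def doc_fraction(corpus):
    """Fraction of corpus documents that contain each term (assumed term prior)."""
    n = len(corpus)
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # count each term at most once per document
    return {t: c / n for t, c in df.items()}


def it_similarity(doc_a, doc_b, pi):
    """Assumed Lin-style similarity: 2 * shared information / total information.

    Every term of doc_a and doc_b must appear in pi. All log terms are <= 0,
    so the ratio of the two sums is non-negative.
    """
    pa, pb = term_probs(doc_a), term_probs(doc_b)
    shared = sum(min(pa[t], pb[t]) * math.log(pi[t]) for t in pa.keys() & pb.keys())
    total = (sum(p * math.log(pi[t]) for t, p in pa.items())
             + sum(p * math.log(pi[t]) for t, p in pb.items()))
    if total == 0 or shared == 0:
        # No shared terms, or every term occurs in every document (zero information).
        return 0.0
    return 2 * shared / total


# Toy corpus of pre-tokenized documents (illustrative data only).
corpus = [["a", "b"], ["a", "c"], ["d", "e"]]
pi = doc_fraction(corpus)
print(it_similarity(corpus[0], corpus[1], pi))
```

Under these assumptions, a document compared with itself scores 1, documents with no terms in common score 0, and rare shared terms contribute more than common ones because their log fraction is larger in magnitude.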