Query expansion using local and global document analysis
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Distribution of content words and phrases in text and language modelling
Natural Language Engineering
A model of lexical attraction and repulsion
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
An evaluation of evolved term-weighting schemes in information retrieval
Proceedings of the 14th ACM international conference on Information and knowledge management
Hi-index | 0.03 |
Previous work addressing the issue of word distribution in documents has shown the importance of Word repetitiveness as an indicator of the word content-bearing characteristics. In this paper we propose a simple method using a measure of the tendency of words to repeat within a document to separate the words with similar document frequencies, but different topic discriminating characteristics. We describe the application of the new measure in query-document relevance scoring. Experiments on the TREC Ad Hoc and Spoken Document Retrieval tasks [7] show useful performance improvements.