On relevance weights with little relevance information
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Using probabilistic models of document retrieval without relevance information
Readings in information retrieval
A theory of term weighting based on exploratory data analysis
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Managing gigabytes (2nd ed.): compressing and indexing documents and images
A Note on Inverse Document Frequency Weighting Scheme
A formal study of information retrieval heuristics
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Why inverse document frequency?
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Relevance information: a loss of entropy but a gain for IDF?
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Interpreting TF-IDF term weights as making relevance decisions
ACM Transactions on Information Systems (TOIS)
Generalized inverse document frequency
Proceedings of the 17th ACM conference on Information and knowledge management
There have been a number of prior attempts to theoretically justify the effectiveness of the inverse document frequency (IDF). Those that take as their starting point Robertson and Sparck Jones's probabilistic model are based on strong or complex assumptions. We show that a more intuitively plausible assumption suffices. Moreover, the new assumption, while conceptually very simple, provides a solution to an estimation problem that had been deemed intractable by Robertson and Walker (1997).