Term-weighting approaches in automatic text retrieval
Information Processing and Management: an International Journal
Automatic routing and retrieval using Smart: TREC-2
TREC-2 Proceedings of the second conference on Text retrieval conference
Information Processing and Management: an International Journal - Special issue: history of information science
Pivoted document length normalization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Exploring the similarity space
ACM SIGIR Forum
A probabilistic model of information retrieval: development and comparative experiments Part 2
Information Processing and Management: an International Journal
Models in information retrieval
Lectures on information retrieval
Simplified similarity scoring using term ranks
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
The influence of caption features on clickthrough patterns in web search
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
An exploration of proximity measures in information retrieval
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Latent concept expansion using markov random fields
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A statistical view of binned retrieval models
ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Index ordering by query-independent measures
Information Processing and Management: an International Journal
LePrEF: Learn to precompute evidence fusion for efficient query evaluation
Journal of the American Society for Information Science and Technology
Hi-index | 0.00 |
The BM25 similarity computation has been shown to provide effective document retrieval. In operational terms, the formulae which form the basis for BM25 employ both term frequency and document length normalization. This paper considers an alternative form of normalization using document-centric impacts, and shows that the new normalization simplifies BM25 and reduces the number of tuning parameters. Motivation is provided by a preliminary analysis of a document collection that shows that impacts are more likely to identify documents whose lengths resemble those of the relevant judgments.Experiments on TREC data demonstrate that impact-based BM25 is as good as or better than the original term frequency-based BM25 in terms of retrieval effectiveness.