In this paper, we revisit the Divergence from Randomness (DFR) approach to Information Retrieval (IR) in order to better understand the different contributions it relies on, and thus to be able to simplify it. To do so, we first introduce an analytical characterization of heuristic retrieval constraints and review several DFR models with respect to this characterization. This review shows that the first normalization principle of DFR is necessary to make the model compliant with the retrieval constraints. We then show that the log-logistic distribution can be used to derive a simplified DFR model. Interestingly, this simplified model contains Language Models (LM) with Jelinek-Mercer smoothing. To our knowledge, this relation is the first connection between the DFR and LM approaches. Lastly, we present experimental results on several standard collections that validate both the simplification and the proposed model.
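For readers unfamiliar with Jelinek-Mercer smoothing, the sketch below illustrates the standard query-likelihood scoring it refers to, not the log-logistic DFR model proposed in the paper. It assumes the common textbook form p(w|d) = (1 - λ) tf(w,d)/|d| + λ p(w|C), with λ weighting the collection model (some presentations put λ on the document model instead). The function name `jm_score`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def jm_score(query, doc, collection_counts, collection_length, lam=0.5):
    """Log query likelihood of `doc` under Jelinek-Mercer smoothing.

    Assumed convention: `lam` is the weight of the collection model,
    so lam=0 means no smoothing and lam=1 ignores the document.
    """
    tf = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        # Maximum-likelihood estimate from the document itself.
        p_doc = tf[w] / doc_len if doc_len else 0.0
        # Maximum-likelihood estimate from the whole collection.
        p_coll = collection_counts.get(w, 0) / collection_length
        # Linear interpolation of the two estimates.
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # Term unseen in both document and collection.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (hypothetical data).
docs = [["a", "b", "a"], ["b", "c"]]
collection_counts = Counter(w for d in docs for w in d)
collection_length = sum(collection_counts.values())

scores = [jm_score(["a"], d, collection_counts, collection_length) for d in docs]
```

In this toy collection the first document contains the query term and therefore scores higher than the second, which receives probability mass for the term only through the collection model.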