Pivoted document length normalization
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Analysis of a very large web search engine query log
ACM SIGIR Forum
A probabilistic model of information retrieval: development and comparative experiments
Information Processing and Management: an International Journal
Query-based sampling of text databases
ACM Transactions on Information Systems (TOIS)
Information Retrieval
Document normalization revisited
SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Probabilistic models of information retrieval based on measuring the divergence from randomness
ACM Transactions on Information Systems (TOIS)
A study of parameter tuning for term frequency normalization
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
A study of the dirichlet priors for term frequency normalisation
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
InfoScale '06 Proceedings of the 1st international conference on Scalable information systems
Information Systems
Parameter sensitivity in the probabilistic model for ad-hoc retrieval
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Artificial Intelligence Review
Setting per-field normalisation hyper-parameters for the named-page finding search task
ECIR'07 Proceedings of the 29th European conference on IR research
Reverted indexing for feedback and expansion
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Document length normalization using effective level of term frequency in large collections
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval
A constraint to automatically regulate document-length normalisation
Proceedings of the 21st ACM international conference on Information and knowledge management
Tuning the term frequency normalisation parameter is a crucial issue in information retrieval (IR), as it has an important impact on retrieval performance. The classical pivoted normalisation approach suffers from the collection-dependence problem. As a consequence, it requires relevance assessments for each given collection to obtain the optimal parameter setting. In this paper, we tackle the collection-dependence problem by proposing a new tuning method based on measuring the normalisation effect. The proposed method refines and extends our methodology described in [7]. In our experiments, we evaluate the proposed tuning method on various TREC collections, for both normalisation 2 of the Divergence From Randomness (DFR) models and BM25's normalisation method. Results show that, for both normalisation methods, our tuning method significantly outperforms robust, empirically obtained baselines over diverse TREC collections, at a marginal computational cost.
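For orientation, the two term frequency normalisation functions named in the abstract are usually written in the forms sketched below. This is a minimal illustration of the commonly cited formulas, not the tuning method the paper proposes. The function names and the default values c = 1.0 and b = 0.75 are assumptions chosen for the example.

```python
import math


def normalisation2(tf, doc_len, avg_len, c=1.0):
    """DFR normalisation 2: tfn = tf * log2(1 + c * avg_len / doc_len).

    c is the free parameter that a tuning method would set (assumed default).
    """
    return tf * math.log2(1.0 + c * avg_len / doc_len)


def bm25_normalisation(tf, doc_len, avg_len, b=0.75):
    """BM25 length normalisation: tfn = tf / ((1 - b) + b * doc_len / avg_len).

    b is the free parameter that a tuning method would set (assumed default).
    """
    return tf / ((1.0 - b) + b * doc_len / avg_len)


if __name__ == "__main__":
    # A document of exactly average length keeps its raw term frequency
    # under both functions when c = 1.0.
    print(normalisation2(3, 100, 100))       # 3.0
    print(bm25_normalisation(3, 100, 100))   # 3.0
    # A document twice the average length has its term frequency discounted.
    print(normalisation2(3, 200, 100))
    print(bm25_normalisation(3, 200, 100))
```

In both functions a single parameter controls how strongly long documents are penalised, which is why its best value can shift from one collection to another.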