An axiomatic comparison of learned term-weighting schemes in information retrieval: clarifications and extensions

Authors:
Ronan Cummins; Colm O'Riordan
Affiliations:
Department of Information Technology, National University of Ireland, Galway, Ireland; Department of Information Technology, National University of Ireland, Galway, Ireland
Venue:
Artificial Intelligence Review
Year:
2007

Abstract

Machine learning approaches to information retrieval are becoming increasingly widespread. In this paper, we present term-weighting functions reported in the literature that were developed by four separate approaches using genetic programming. Recently, a number of axioms (constraints), from which all good term-weighting schemes should be deduced, have been developed and shown to be theoretically and empirically sound. We introduce a new axiom and empirically validate it by modifying the standard BM25 scheme. Furthermore, we analyse the BM25 scheme and the four learned schemes presented to determine if the schemes are consistent with the axioms. We find that one learned term-weighting approach is consistent with more axioms than any of the other schemes. An empirical evaluation of the schemes on various test collections and query lengths shows that the scheme that is consistent with more of the axioms outperforms the other schemes.
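
To make the idea of checking a term-weighting scheme against axioms concrete, the following is a minimal sketch, not the authors' code. It assumes a commonly used BM25 formulation with default parameters k1 = 1.2 and b = 0.75, and it checks two illustrative constraints of the kind discussed in the axiomatic retrieval literature: the weight should grow with term frequency, and a longer document with the same term frequency should not score higher. The function names and the constraints chosen are assumptions made for illustration; the new axiom introduced in the paper is not reproduced here.

```python
import math


def bm25_weight(tf, doc_len, avg_doc_len, df, n_docs, k1=1.2, b=0.75):
    """Score contribution of one query term under a common BM25 variant.

    k1 and b are conventional defaults, assumed here for illustration.
    """
    if tf == 0:
        return 0.0
    # Inverse document frequency component (negative when df > n_docs / 2).
    idf = math.log((n_docs - df + 0.5) / (df + 0.5))
    # Document-length normalisation applied to the term frequency.
    norm = k1 * ((1.0 - b) + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / (tf + norm)


def tf_monotonic(weight_fn, max_tf=20, **kw):
    """Illustrative constraint: the weight strictly increases with tf."""
    scores = [weight_fn(tf, **kw) for tf in range(1, max_tf + 1)]
    return all(a < b for a, b in zip(scores, scores[1:]))


def length_penalised(weight_fn, tf, lengths, **kw):
    """Illustrative constraint: for fixed tf, longer documents never score higher.

    `lengths` is expected in ascending order.
    """
    scores = [weight_fn(tf, doc_len=dl, **kw) for dl in lengths]
    return all(a >= b for a, b in zip(scores, scores[1:]))


if __name__ == "__main__":
    rare = dict(doc_len=100, avg_doc_len=100, df=10, n_docs=1000)
    common = dict(doc_len=100, avg_doc_len=100, df=900, n_docs=1000)
    print("tf constraint, rare term:  ", tf_monotonic(bm25_weight, **rare))
    print("tf constraint, common term:", tf_monotonic(bm25_weight, **common))
    print("length constraint:         ",
          length_penalised(bm25_weight, 3, [50, 100, 200, 400],
                           avg_doc_len=100, df=10, n_docs=1000))
```

In this sketch the term-frequency check fails for a term that occurs in more than half of the documents, because the idf component of this BM25 variant becomes negative there. An analytical comparison of schemes works at the level of the formulas; a numeric probe like this one can only flag violations for the parameter values it samples.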