A study of parameter tuning for term frequency normalization

  • Authors:
  • Ben HE;Iadh Ounis

  • Affiliations:
  • University of Glasgow, Glasgow, UK;University of Glasgow, Glasgow, UK

  • Venue:
  • CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most current term frequency normalization approaches for information retrieval involve the use of parameters. The tuning of these parameters has an important impact on the overall performance of the information retrieval system. Indeed, a small variation in the involved parameter(s) could lead to an important variation in the precision/recall values. Most current tuning approaches are dependent on the document collections. As a consequence, the effective parameter value cannot be obtained for a given new collection without extensive training data. In this paper, we propose a novel and robust method for the tuning of term frequency normalization parameter(s), by measuring the normalization effect on the within document frequency of the query terms. As an illustration, we apply our method on Amati \& Van Rijsbergen's so-called normalization 2. The experiments for the ad-hoc TREC-6,7,8 tasks and TREC-8,9,10 Web tracks show that the new method is independent of the collections and able to provide reliable and good performance.