A novel TF-IDF weighting scheme for effective ranking

  • Authors:
  • Jiaul H. Paik

  • Affiliations:
  • Indian Statistical Institute, Kolkata, India

  • Venue:
  • Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Term weighting schemes are central to the study of information retrieval systems. This article proposes a novel TF-IDF term weighting scheme that employs two different within document term frequency normalizations to capture two different aspects of term saliency. One component of the term frequency is effective for short queries, while the other performs better on long queries. The final weight is then measured by taking a weighted combination of these components, which is determined on the basis of the length of the corresponding query. Experiments conducted on a large number of TREC news and web collections demonstrate that the proposed scheme almost always outperforms five state of the art retrieval models with remarkable significance and consistency. The experimental results also show that the proposed model achieves significantly better precision than the existing models.