Discretization based learning approach to information retrieval

  • Authors:
  • Dmitri Roussinov;Weiguo Fan

  • Affiliations:
  • Arizona State University, Tempe, AZ;Virginia Polytechnic Institute and State University, Blacksburg, VA

  • Venue:
  • HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We approached the problem as learning how to order documents by estimated relevance with respect to a user query. Our support vector machines based classifier learns from the relevance judgments available with the standard test collections and generalizes to new, previously unseen queries. For this, we have designed a representation scheme, which is based on the discrete representation of the local (lw) and global (gw) weighting functions, thus is capable of reproducing and enhancing the properties of such popular ranking functions as tf.idf, BM25 or those based on language models. Our tests with the standard test collections have demonstrated the capability of our approach to achieve the performance of the best known scoring functions solely from the labeled examples and without taking advantage of knowing those functions or their important properties or parameters.