Language model for IR using collection information

  • Authors:
  • Rong Jin;Luo Si;Alex G. Hauptmann;Jamie Callan

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Information retrieval using meta data can be traced back to the early age of IR where documents are represented by the controlled vocabulary. In this paper, we explore the usage of meta-data information under the framework of language model. We present a new language model that is able to take advantage of the category information for documents to improve the retrieval accuracy. We compare the new language model with the traditional language model over the TREC4 dataset where the collection information for documents is obtained using the k-means clustering method. The new language model outperforms the traditional language model, which verifies our statement.