Estimation of query model from parsimonious translation model

  • Authors:
  • Seung-Hoon Na;In-Su Kang;Sin-Jae Kang;Jong-Hyeok Lee

  • Affiliations:
  • Division of Electrical and Computer Engineering, POSTECH, AITrc, Republic of Korea;Division of Electrical and Computer Engineering, POSTECH, AITrc, Republic of Korea;School of Computer and Information Technology, Daegu University, Republic of Korea;Division of Electrical and Computer Engineering, POSTECH, AITrc, Republic of Korea

  • Venue:
  • AIRS'04 Proceedings of the 2004 international conference on Asian Information Retrieval Technology
  • Year:
  • 2004

Abstract

The KL divergence framework, an extended language modeling approach, faces a critical problem in estimating the query model, the probabilistic model that encodes the user's information need. At initial retrieval, however, it is difficult to expand the query model using co-occurrence, because two-dimensional information such as a term co-occurrence matrix must be constructed offline. For large collections in particular, constructing such a large matrix of term co-occurrences is prohibitively expensive in time and space. This paper proposes an effective method for constructing co-occurrence statistics by employing a parsimonious translation model. The parsimonious translation model is a compact version of the translation model that retains only a very small number of parameters with non-zero probabilities. It enormously reduces the number of terms remaining in each document, so that co-occurrence statistics can be calculated in tractable time. In experiments, the query model derived from the parsimonious translation model significantly improves on baseline language modeling performance.
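
The abstract gives no formulas, so the following is only a minimal sketch of the general idea: prune each document to a few topical terms, then count co-occurrences among the surviving terms only. It assumes a common EM-style parsimonious estimate against a background collection model; this may differ from the authors' actual estimation procedure. The function names (`parsimonious_terms`, `cooccurrence`) and the parameters `lam`, `iters`, and `threshold` are hypothetical and chosen for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def parsimonious_terms(doc_tokens, background, lam=0.1, iters=20, threshold=1e-3):
    """Return a pruned term distribution for one document.

    Assumption: an EM-style parsimonious estimate in which probability mass
    moves away from terms that the background model already explains well.
    `background` maps term -> collection probability (hypothetical input).
    """
    tf = Counter(doc_tokens)
    total = sum(tf.values())
    p = {w: c / total for w, c in tf.items()}  # start from maximum likelihood
    for _ in range(iters):
        # E-step: expected count of each term generated by the document model
        expected = {}
        for w, pw in p.items():
            doc_part = lam * pw
            bg_part = (1.0 - lam) * background.get(w, 1e-9)
            expected[w] = tf[w] * doc_part / (doc_part + bg_part)
        norm = sum(expected.values())
        if norm == 0:
            break
        # M-step: renormalize, drop terms below the threshold, renormalize again
        kept = {w: v / norm for w, v in expected.items() if v / norm >= threshold}
        kept_sum = sum(kept.values())
        if kept_sum == 0:
            break
        p = {w: v / kept_sum for w, v in kept.items()}
    return p


def cooccurrence(pruned_docs):
    """Count document-level co-occurrences among retained terms only."""
    counts = defaultdict(int)
    for dist in pruned_docs:
        for w, v in combinations(sorted(dist), 2):
            counts[(w, v)] += 1
    return counts


# Toy usage: the background probabilities below are made up for illustration.
background = {"the": 0.5, "of": 0.3, "retrieval": 0.01, "model": 0.01}
doc = ["the", "the", "the", "retrieval", "retrieval", "model", "of"]
pruned = parsimonious_terms(doc, background)
pairs = cooccurrence([pruned])
```

Because the number of term pairs grows quadratically with the number of terms kept per document, pruning each document before counting is what keeps the co-occurrence statistics small enough to compute.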