Word sense language model for information retrieval

  • Authors:
  • Liqi Gao;Yu Zhang;Ting Liu;Guiping Liu

  • Affiliations:
  • Information Retrieval Laboratory, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, P.R. China (all four authors)

  • Venue:
  • AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
  • Year:
  • 2006

Abstract

This paper proposes an information retrieval method based on a word sense language model. Unlike most traditional methods, it combines word senses defined in a thesaurus with a classic statistical model. The word sense language model treats the word sense as a form of linguistic knowledge, which helps with two problems: term mismatch caused by synonymy, and data sparseness caused by limited training data. Experimental results on the TREC Mandarin corpus show that this method improves MAP by 12.5% over a traditional tf-idf retrieval method, but lowers MAP by 5.82% compared with a classic language model. Combining this method with the language model yields increases of 8.92% and 7.93% over the two, respectively. We analyze and discuss these modest results and conclude that higher performance from the word sense language model will depend on highly accurate word sense labeling. We believe that linguistic knowledge, such as the word senses of a thesaurus, will ultimately help improve IR in many ways.
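The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a unigram query-likelihood model with Jelinek-Mercer smoothing, a hypothetical toy thesaurus (`THESAURUS`) that maps words to sense codes, and a linear interpolation of the word-level and sense-level log scores as one possible way to combine the two models. All names and parameter values (`lam`, `alpha`) are illustrative assumptions.

```python
import math
from collections import Counter

# Hypothetical toy thesaurus (word -> sense code); not taken from the paper.
THESAURUS = {"car": "S1", "automobile": "S1", "fast": "S2", "quick": "S2"}


def to_senses(tokens):
    """Replace each token by its sense code; unknown words stand for themselves."""
    return [THESAURUS.get(t, t) for t in tokens]


def query_log_likelihood(query, doc, collection, lam=0.5):
    """Unigram query log-likelihood with Jelinek-Mercer smoothing (assumed choice)."""
    doc_tf, col_tf = Counter(doc), Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_tf[term] / len(doc) if doc else 0.0
        p_col = col_tf[term] / len(collection)
        p = (1 - lam) * p_doc + lam * p_col
        if p == 0.0:
            # The term never occurs in the collection: the query cannot be generated.
            return float("-inf")
        score += math.log(p)
    return score


def sense_log_likelihood(query, doc, collection, lam=0.5):
    """Same scoring, but over sense codes instead of surface words."""
    return query_log_likelihood(
        to_senses(query), to_senses(doc), to_senses(collection), lam
    )


def combined_score(query, doc, collection, alpha=0.5, lam=0.5):
    """Interpolate word-level and sense-level log scores (assumed combination).

    alpha is expected to lie strictly between 0 and 1.
    """
    word = query_log_likelihood(query, doc, collection, lam)
    sense = sense_log_likelihood(query, doc, collection, lam)
    return alpha * word + (1 - alpha) * sense


docs = [["automobile", "repair"], ["fast", "food"]]
collection = [t for d in docs for t in d]
for d in docs:
    print(d, sense_log_likelihood(["car"], d, collection))
```

In this toy collection the query word "car" never appears, so the word-level model cannot score either document, while the sense-level model matches "car" to "automobile" through the shared sense code. This is the synonym-mismatch case the abstract refers to. The sketch also assumes every word has exactly one sense; a real system would need the word sense labeling step that the abstract identifies as the main factor in performance.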