Hidden Markov model for term weighting in verbose queries

Authors:
Xueliang Yan;Guanglai Gao;Xiangdong Su;Hongxi Wei;Xueliang Zhang;Qianqian Lu
Affiliations:
College of Computer Science, Inner Mongolia University, Hohhot, China;College of Computer Science, Inner Mongolia University, Hohhot, China;College of Computer Science, Inner Mongolia University, Hohhot, China;College of Computer Science, Inner Mongolia University, Hohhot, China;College of Computer Science, Inner Mongolia University, Hohhot, China;College of Computer Science, Inner Mongolia University, Hohhot, China
Venue:
CLEF'12 Proceedings of the Third international conference on Information Access Evaluation: multilinguality, multimodality, and visual analytics
Year:
2012


Abstract

Short queries have been observed to generally perform better than their corresponding long versions when run against the same IR model, mainly because most current models do not distinguish the importance of different query terms. Observing that sentence-like queries encode information about term importance in their grammatical structure, we propose a Hidden Markov Model (HMM) based method that extracts this information for term weighting. The choice of HMM is motivated by its successful application to capturing relationships between adjacent terms in the NLP field. Since we are dealing with natural-language queries, we expect that an HMM can also capture the dependence between term weights and grammatical structure. Our experiments show that this assumption is reasonable and that such information, when utilized properly, can greatly improve retrieval performance.
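The abstract does not specify the model's states, observations, or parameters, so the following is only a minimal sketch of how an HMM could tie term weights to grammatical structure, not the authors' method. It assumes that hidden states are two discrete weight levels ("low", "high"), that observations are part-of-speech tags, and that decoding uses the standard Viterbi algorithm. Every probability, the tag set, and the names `viterbi`, `term_weights`, and `LEVEL_WEIGHT` are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical model: hidden states are weight levels, observations are POS tags.
# All probabilities are invented for illustration, not estimated from data.
STATES = ["low", "high"]
START = {"low": 0.6, "high": 0.4}
TRANS = {
    "low": {"low": 0.5, "high": 0.5},
    "high": {"low": 0.7, "high": 0.3},
}
EMIT = {
    "low": {"DT": 0.4, "IN": 0.3, "VB": 0.2, "NN": 0.1},
    "high": {"DT": 0.05, "IN": 0.05, "VB": 0.2, "NN": 0.7},
}
# Assumed mapping from a decoded weight level to a numeric term weight.
LEVEL_WEIGHT = {"low": 0.2, "high": 1.0}


def viterbi(tags):
    """Return the most likely weight-level sequence for a non-empty tag list."""
    # scores[s]: best log-probability of any state path ending in state s.
    scores = {s: math.log(START[s]) + math.log(EMIT[s][tags[0]]) for s in STATES}
    backpointers = []
    for tag in tags[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for a transition into s.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][tag])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def term_weights(terms, tags):
    """Pair each query term with the weight of its decoded level."""
    if len(terms) != len(tags):
        raise ValueError("terms and tags must have the same length")
    if not terms:
        return []
    return [(term, LEVEL_WEIGHT[level]) for term, level in zip(terms, viterbi(tags))]


if __name__ == "__main__":
    query = ["find", "the", "effects", "of", "pollution"]
    pos = ["VB", "DT", "NN", "IN", "NN"]
    for term, weight in term_weights(query, pos):
        print(f"{term}\t{weight}")
```

In a real system the transition and emission probabilities would be estimated from training queries rather than set by hand, and the decoded weights would be passed to the retrieval model as per-term query weights.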