Exploiting proximity feature in bigram language model for information retrieval

  • Authors:
  • Seung-Hoon Na; Jungi Kim; In-Su Kang; Jong-Hyeok Lee

  • Affiliations:
  • Pohang University of Science and Technology (POSTECH), Pohang, South Korea; Pohang University of Science and Technology (POSTECH), Pohang, South Korea; Kyungsung University, Pusan, South Korea; Pohang University of Science and Technology (POSTECH), Pohang, South Korea

  • Venue:
  • Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2008

Abstract

Language modeling approaches effectively handle dependencies among query terms using N-gram models such as bigram or trigram models. However, bigram language models suffer from the adjacency-sparseness problem: dependent terms are not always adjacent in documents, but can be far apart, sometimes separated by several sentences. To resolve the adjacency-sparseness problem, this paper proposes a new type of bigram language model that explicitly incorporates a proximity feature between two adjacent terms in a query. Experimental results on three test collections show that the proposed bigram language model significantly improves on the previous bigram model as well as on Tao's approach, the state-of-the-art proximity-based method.
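The core idea, scoring a query bigram by how close its two terms occur in a document rather than requiring strict adjacency, can be sketched as follows. This is a minimal illustration, assuming an exponential decay kernel and a `sigma` parameter that are not taken from the paper; the paper's actual model integrates proximity into a bigram language model rather than this simple additive score.

```python
import math

def min_pair_distance(positions_a, positions_b):
    """Smallest absolute distance between any occurrence of term a and term b."""
    best = math.inf
    for pa in positions_a:
        for pb in positions_b:
            best = min(best, abs(pa - pb))
    return best

def proximity_weight(dist, sigma=5.0):
    """Illustrative decay kernel: strictly adjacent terms (dist = 1) get
    weight 1.0, and the weight falls off as the terms drift apart."""
    return math.exp(-(dist - 1) / sigma)

def score_query_bigrams(query_terms, doc_positions):
    """Sum proximity-weighted scores over consecutive query-term pairs.

    doc_positions maps each term to the list of its token positions in
    the document; pairs with a missing term contribute nothing.
    """
    score = 0.0
    for a, b in zip(query_terms, query_terms[1:]):
        if a in doc_positions and b in doc_positions:
            d = min_pair_distance(doc_positions[a], doc_positions[b])
            score += proximity_weight(d)
    return score

# Example: "language model" occurring adjacently vs. six tokens apart.
adjacent = score_query_bigrams(["language", "model"],
                               {"language": [0, 10], "model": [1]})
distant = score_query_bigrams(["language", "model"],
                              {"language": [0], "model": [6]})
```

Here `adjacent` evaluates to 1.0 (distance 1), while `distant` is smaller (distance 6), capturing the intuition that a dependent term pair should still contribute to the bigram score when the terms are nearby but not adjacent.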