CRTER: using cross terms to enhance probabilistic information retrieval

Authors:
Jiashu Zhao;Jimmy Xiangji Huang;Ben He
Affiliations:
York University, Toronto, ON, Canada;York University, Toronto, ON, Canada;York University, Toronto, ON, Canada
Venue:
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Year:
2011

Citing 17
Cited 3

Effective document presentation with a locality-based similarity heuristic

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
A general language model for information retrieval

Proceedings of the eighth international conference on Information and knowledge management
Biterm language models for document retrieval

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Single n-gram stemming

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Exploring term dependences in probabilistic information retrieval model

Information Processing and Management: an International Journal
Character N-Gram Tokenization for European Language Text Retrieval

Information Retrieval
Dependence language model for information retrieval

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
An information retrieval model using the fuzzy proximity degree of term occurences

Proceedings of the 2005 ACM symposium on Applied computing
A Markov random field model for term dependencies

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
TREC: Experiment and Evaluation in Information Retrieval (Digital Libraries and Electronic Publishing)

TREC: Experiment and Evaluation in Information Retrieval (Digital Libraries and Electronic Publishing)
Term proximity scoring for ad-hoc retrieval on very large text collections

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
An exploration of proximity measures in information retrieval

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Proximity-based document representation for named entity retrieval

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Proximity-aware scoring for XML retrieval

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Evaluation of n-gram conflation approaches for Arabic text retrieval

Journal of the American Society for Information Science and Technology
Positional language models for information retrieval

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Term proximity scoring for keyword-based retrieval systems

ECIR'03 Proceedings of the 25th European conference on IR research

Proximity-based rocchio's model for pseudo relevance

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Modeling geographic, temporal, and proximity contexts for improving geotemporal search

Journal of the American Society for Information Science and Technology
Exploiting proximity feature in statistical translation models for information retrieval

Proceedings of the 22nd ACM international conference on Conference on information & knowledge management

Quantified Score

Hi-index	0.00

Visualization

Abstract

Term proximity retrieval rewards a document where the matched query terms occur close to each other. Although term proximity is known to be effective in many Information Retrieval (IR) applications, the within-document distribution of each individual query term and how the query terms associate with each other, are not fully considered. In this paper, we introduce a pseudo term, namely Cross Term, to model term proximity for boosting retrieval performance. An occurrence of a query term is assumed to have an impact towards its neighboring text, which gradually weakens with the increase of the distance to the place of occurrence. We use a shape function to characterize such an impact. A Cross Term occurs when two query terms appear close to each other and their impact shape functions have an intersection. We propose a Cross Term Retrieval (CRTER) model that combines the Cross Terms' information with basic probabilistic weighting models to rank the retrieved documents. Extensive experiments on standard TREC collections illustrate the effectiveness of our proposed CRTER model.