Investigation of partial query proximity in web search

Authors:
Jing Bai; Yi Chang; Hang Cui; Zhaohui Zheng; Gordon Sun; Xin Li
Affiliations:
Yahoo! Inc., Sunnyvale, USA;Yahoo! Inc., Sunnyvale, USA;Yahoo! Inc., Sunnyvale, USA;Yahoo! Inc., Sunnyvale, USA;Yahoo! Inc., Sunnyvale, USA;Yahoo! Inc., Sunnyvale, USA
Venue:
Proceedings of the 17th international conference on World Wide Web
Year:
2008

Abstract

Proximity of query terms in a document is an important criterion in IR. However, no investigation has determined which term sequences are most useful for measuring proximity. In this study, we test the effectiveness of using the proximity of partial term sequences (n-grams) for Web search. We observe that the proximity of sequences of 3 to 5 terms is most effective for long queries, while shorter or longer sequences appear less useful. This suggests that combinations of 3 to 5 terms best capture the intent of user queries. We also experiment with weighting the importance of query sub-sequences using query log frequencies. Our preliminary tests show promising empirical results.
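
The abstract does not state how proximity is computed, so the following is only an illustrative sketch, not the authors' method. It assumes that a "partial term sequence" is a contiguous query n-gram of 3 to 5 terms, that proximity is measured by the shortest document window covering all terms of the n-gram, and that an optional weight per n-gram comes from a query log frequency table. The names `query_ngrams`, `min_cover_span`, `proximity_score`, and `log_freq` are hypothetical.

```python
from collections import defaultdict


def query_ngrams(terms, min_n=3, max_n=5):
    """Contiguous query sub-sequences with min_n..max_n terms."""
    return [tuple(terms[i:i + n])
            for n in range(min_n, max_n + 1)
            for i in range(len(terms) - n + 1)]


def min_cover_span(ngram, doc_tokens):
    """Length of the shortest window in doc_tokens containing every
    distinct term of ngram (in any order), or None if a term is missing."""
    needed = set(ngram)
    counts = defaultdict(int)
    covered = 0
    best = None
    left = 0
    for right, tok in enumerate(doc_tokens):
        if tok in needed:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Shrink the window from the left while it still covers all terms.
        while covered == len(needed):
            span = right - left + 1
            if best is None or span < best:
                best = span
            left_tok = doc_tokens[left]
            if left_tok in needed:
                counts[left_tok] -= 1
                if counts[left_tok] == 0:
                    covered -= 1
            left += 1
    return best


def proximity_score(query_terms, doc_tokens, log_freq=None, min_n=3, max_n=5):
    """Sum over query n-grams of weight * (distinct terms / shortest span).

    log_freq is an assumed mapping from n-gram tuples to a weight derived
    from query log frequencies; n-grams absent from it get weight 0.
    Without log_freq every n-gram has weight 1.
    """
    score = 0.0
    for ngram in query_ngrams(query_terms, min_n, max_n):
        span = min_cover_span(ngram, doc_tokens)
        if span is None:
            continue  # at least one term of the n-gram is not in the document
        weight = 1.0 if log_freq is None else log_freq.get(ngram, 0.0)
        score += weight * len(set(ngram)) / span
    return score


query = "cheap flights new york city".split()
doc = "find cheap flights to new york city today".split()
print(proximity_score(query, doc))
```

Under these assumptions, an n-gram whose terms appear adjacently in the document contributes its full weight, and the contribution shrinks as the covering window grows. Restricting `min_n` and `max_n` to 3 and 5 mirrors the range the abstract reports as most effective for long queries.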