Re-ranking search results using language models of query-specific clusters

Authors:
Oren Kurland
Affiliations:
Faculty of Industrial Engineering and Management, Technion--Israel Institute of Technology, Haifa, Israel 32000
Venue:
Information Retrieval
Year:
2009

Citing 38
Cited 9

Elements of information theory

Elements of information theory
Scatter/Gather: a cluster-based approach to browsing large document collections

SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
Reexamining the cluster hypothesis: scatter/gather on retrieval results

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Hierarchic document classification using Ward's clustering method

Proceedings of the 9th annual international ACM SIGIR conference on Research and development in information retrieval
Using interdocument similarity information in document retrieval systems

Readings in information retrieval
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A general language model for information retrieval (poster abstract)

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Authoritative sources in a hyperlinked environment

Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Re-ranking model based on document clusters

Information Processing and Management: an International Journal
Document language models, query models, and risk minimization for information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A study of smoothing methods for language models applied to Ad Hoc information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating document clustering for interactive information retrieval

Proceedings of the tenth international conference on Information and knowledge management
Information Retrieval

Information Retrieval
The effectiveness of query-specific hierarchic clustering in information retrieval

Information Processing and Management: an International Journal
Error analysis of difficult TREC topics

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Language Modeling for Information Retrieval

Language Modeling for Information Retrieval
Latent dirichlet allocation

The Journal of Machine Learning Research
A survey on the use of relevance feedback for information access systems

The Knowledge Engineering Review
Evaluating high accuracy retrieval techniques

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Cluster-based retrieval using language models

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Corpus structure, language models, and ad hoc information retrieval

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Better than the real thing?: iterative pseudo-query processing using cluster-based language models

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
PageRank without hyperlinks: structural re-ranking using links induced by language models

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Improving web search results using affinity graph

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Re-ranking method based on inter-document distances

Information Processing and Management: an International Journal
Regularizing ad hoc retrieval scores

Proceedings of the 14th ACM international conference on Information and knowledge management
Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Improving the estimation of relevance models using large external corpora

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
LDA-based document models for ad-hoc retrieval

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Representing clusters for retrieval

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Language model information retrieval with document expansion

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Inter-document similarities, language models, and ad hoc information retrieval

Inter-document similarities, language models, and ad hoc information retrieval
Relevance models for topic detection and tracking

HLT '02 Proceedings of the second international conference on Human Language Technology Research
The opposite of smoothing: a language model approach to ranking query-specific document clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A rank-aggregation approach to searching for optimal query-specific clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating text representations for retrieval of the best group of documents

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval

Cluster-based fusion of retrieved lists

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Negation for document re-ranking in ad-hoc retrieval

ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
The opposite of smoothing: a language model approach to ranking query-specific document clusters

Journal of Artificial Intelligence Research
Fusing different information retrieval systems according to query-topics: a study based on correlation in information retrieval systems and TREC topics

Information Retrieval
A study of the integration of passage-, document-, and cluster-based information for re-ranking search results

Information Retrieval
A cluster based pseudo feedback technique which exploits good and bad clusters

CAEPIA'11 Proceedings of the 14th international conference on Advances in artificial intelligence: spanish association for artificial intelligence
Exploring the cluster hypothesis, and cluster-based retrieval, over the web

Proceedings of the 21st ACM international conference on Information and knowledge management
Ranking document clusters using markov random fields

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
A novel neighborhood based document smoothing model for information retrieval

Information Retrieval

Quantified Score

Hi-index	0.00

Visualization

Abstract

To obtain high precision at top ranks by a search performed in response to a query, researchers have proposed a cluster-based re-ranking paradigm: clustering an initial list of documents that are the most highly ranked by some initial search, and using information induced from these (often called) query-specific clusters for re-ranking the list. However, results concerning the effectiveness of various automatic cluster-based re-ranking methods have been inconclusive. We show that using query-specific clusters for automatic re-ranking of top-retrieved documents is effective with several methods in which clusters play different roles, among which is the smoothing of document language models. We do so by adapting previously-proposed cluster-based retrieval approaches, which are based on (static) query-independent clusters for ranking all documents in a corpus, to the re-ranking setting wherein clusters are query-specific. The best performing method that we develop outperforms both the initial document-based ranking and some previously proposed cluster-based re-ranking approaches; furthermore, this algorithm consistently outperforms a state-of-the-art pseudo-feedback-based approach. In further exploration we study the performance of cluster-based smoothing methods for re-ranking with various (soft and hard) clustering algorithms, and demonstrate the importance of clusters in providing context from the initial list through a comparison to using single documents to this end.