The opposite of smoothing: a language model approach to ranking query-specific document clusters

Authors:
Oren Kurland
Affiliations:
Technion, Haifa, Israel
Venue:
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2008

Citing 30
Cited 17

Query expansion using local and global document analysis

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Reexamining the cluster hypothesis: scatter/gather on retrieval results

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
The cluster hypothesis revisited

SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Using interdocument similarity information in document retrieval systems

Readings in information retrieval
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
Demonstration of hierarchical document clustering of digital library retrieval results

Proceedings of the 1st ACM/IEEE-CS joint conference on Digital libraries
Document language models, query models, and risk minimization for information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A study of smoothing methods for language models applied to Ad Hoc information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Evaluating document clustering for interactive information retrieval

Proceedings of the tenth international conference on Information and knowledge management
Information Retrieval

Information Retrieval
A language modeling framework for resource selection and results merging

Proceedings of the eleventh international conference on Information and knowledge management
The effectiveness of query-specific hierarchic clustering in information retrieval

Information Processing and Management: an International Journal
Language Modeling for Information Retrieval

Language Modeling for Information Retrieval
Cluster-based retrieval using language models

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Corpus structure, language models, and ad hoc information retrieval

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
PageRank without hyperlinks: structural re-ranking using links induced by language models

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Regularizing ad hoc retrieval scores

Proceedings of the 14th ACM international conference on Information and knowledge management
Automatically labeling hierarchical clusters

dg.o '06 Proceedings of the 2006 international conference on Digital government research
Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Improving the estimation of relevance models using large external corpora

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
LDA-based document models for ad-hoc retrieval

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Representing clusters for retrieval

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Inter-document similarities, language models, and ad hoc information retrieval

Inter-document similarities, language models, and ad hoc information retrieval
Automatic labeling of multinomial topic models

Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Relevance models for topic detection and tracking

HLT '02 Proceedings of the second international conference on Human Language Technology Research
A rank-aggregation approach to searching for optimal query-specific clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Cluster generation and cluster labelling for web snippets: a fast and accurate hierarchical solution

SPIRE'06 Proceedings of the 13th international conference on String Processing and Information Retrieval

A rank-aggregation approach to searching for optimal query-specific clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Re-ranking search results using language models of query-specific clusters

Information Retrieval
Utilizing inter-passage and inter-document similarities for re-ranking search results

Proceedings of the 18th ACM conference on Information and knowledge management
A query model based on normalized log-likelihood

Proceedings of the 18th ACM conference on Information and knowledge management
Conceptual language models for domain-specific retrieval

Information Processing and Management: an International Journal
Using statistical decision theory and relevance models for query-performance prediction

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Inducing word senses to improve web search result clustering

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
On identifying representative relevant documents

CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Utilizing inter-passage and inter-document similarities for reranking search results

ACM Transactions on Information Systems (TOIS)
Cluster-based fusion of retrieved lists

Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Re-ranking search results using an additional retrieved list

Information Retrieval
Clustering web search results with maximum spanning trees

AI*IA'11 Proceedings of the 12th international conference on Artificial intelligence around man and beyond
Keyword search over RDF graphs

Proceedings of the 20th ACM international conference on Information and knowledge management
A cluster based pseudo feedback technique which exploits good and bad clusters

CAEPIA'11 Proceedings of the 14th international conference on Advances in artificial intelligence: spanish association for artificial intelligence
The optimum clustering framework: implementing the cluster hypothesis

Information Retrieval
Query-performance prediction and cluster ranking: two sides of the same coin

Proceedings of the 21st ACM international conference on Information and knowledge management
Exploring the cluster hypothesis, and cluster-based retrieval, over the web

Proceedings of the 21st ACM international conference on Information and knowledge management

Quantified Score

Hi-index	0.00

Visualization

Abstract

Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as means for improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also exploits information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high relevant-document percentage. Furthermore, using the model to produce document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed; the performance also favorably compares with that of a state-of-the-art pseudo-feedback retrieval method.