Exploring the cluster hypothesis, and cluster-based retrieval, over the web

Authors:
Fiana Raiber;Oren Kurland
Affiliations:
Technion - Israel Institute of Technology, Haifa, Israel;Technion - Israel Institute of Technology, Haifa, Israel
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Citing 18
Cited 0

Techniques for the measurement of clustering tendency in document retrieval systems

Journal of Information Science
Reexamining the cluster hypothesis: scatter/gather on retrieval results

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
The cluster hypothesis revisited

SIGIR '85 Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A study of smoothing methods for language models applied to Ad Hoc information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Information Retrieval

Information Retrieval
The effectiveness of query-specific hierarchic clustering in information retrieval

Information Processing and Management: an International Journal
Cluster-based retrieval using language models

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
A Markov random field model for term dependencies

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Document re-ranking using cluster validation and label propagation

CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
The opposite of smoothing: a language model approach to ranking query-specific document clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Clusters, language models, and ad hoc information retrieval

ACM Transactions on Information Systems (TOIS)
Re-ranking search results using language models of query-specific clusters

Information Retrieval
A New Measure of the Cluster Hypothesis

ICTIR '09 Proceedings of the 2nd International Conference on Theory of Information Retrieval: Advances in Information Retrieval Theory
Evaluating text representations for retrieval of the best group of documents

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Revisit of nearest neighbor test for direct evaluation of inter-document similarities

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval
Geometric representations for multiple documents

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a study of the cluster hypothesis, and of the performance of cluster-based retrieval methods, performed over large scale Web collections. Among the findings we present are (i) the cluster hypothesis can hold, as determined by a specific test, for large scale Web corpora to the same extent it does for newswire corpora; (ii) while spam documents do not affect the extent to which the cluster hypothesis holds, they considerably affect the performance of cluster based, as well as that of document-based, retrieval methods; and, (iii) as is the case for newswire corpora, cluster-based methods can yield better performance than document-based methods for Web corpora.