Traveling among clusters: a way to reconsider the benefits of the cluster hypothesis

Authors:
Sylvain Lamprier;Tassadit Amghar;Frédéric Saubion;Bernard Levrat
Affiliations:
Université Pierre et Marie Curie;Université d'Angers;Université d'Angers;Université d'Angers
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010


Abstract

Relying on the Cluster Hypothesis, which states that relevant documents tend to be more similar to each other than to non-relevant documents, most information retrieval systems that organize search results as a set of clusters seek to gather all relevant documents in the same cluster. We propose here to reconsider the benefits of the resulting concentration of relevant information. Contrary to what is commonly assumed, we believe that systems which aim to distribute the relevant documents across different clusters, being more likely to highlight different aspects of the subject, may be at least as useful to the user as systems that gather all relevant documents in a single group. Since existing evaluation measures tend to strongly favor the latter systems, we first investigate ways to assess more fairly the ability to reach the relevant information from the list of cluster descriptions. Finally, we show that systems distributing the relevant information across different clusters may actually provide better information access than classical systems.