According to the Cluster Hypothesis, relevant documents tend to be more similar to each other than to non-relevant documents. Relying on it, most information retrieval systems that organize search results as a set of clusters seek to gather all relevant documents in the same cluster. We propose to reconsider the benefits of this concentration of relevant information. Contrary to the common view, we believe that systems which distribute the relevant documents across different clusters are more likely to highlight different aspects of the subject, and may therefore be at least as useful to the user as systems that gather all relevant documents in a single group. Since existing evaluation measures strongly favor the latter systems, we first investigate ways to assess more fairly how well the relevant information can be reached from the list of cluster descriptions. Finally, we show that systems distributing the relevant information across different clusters may actually provide better information access than classical systems.