The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the cluster hypothesis. The hypothesis states that relevant documents tend to be highly similar to each other and therefore tend to appear in the same clusters. In this paper we propose an axiomatic view of the hypothesis by suggesting that documents relevant to the same query (co-relevant documents) display an inherent similarity to each other that is dictated by the query itself. Because of this inherent similarity, the cluster hypothesis should be valid for any document collection. Our research describes an attempt to devise means by which this similarity can be detected. We propose the use of query-sensitive similarity measures that bias interdocument relationships toward pairs of documents that jointly possess attributes expressed in a query. We experimentally tested three query-sensitive measures against conventional ones that do not take the query into account, and we also examined the comparative effectiveness of the three query-sensitive measures. We calculated interdocument relationships for varying numbers of top-ranked documents for six document collections. Our results show a consistent and significant increase in the number of relevant documents that become nearest neighbors of any given relevant document when query-sensitive measures are used. These results suggest that the effectiveness of a cluster-based information retrieval system has the potential to increase through the use of query-sensitive similarity measures.
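The idea of a query-sensitive measure, and the nearest-neighbor test used to evaluate it, can be illustrated with a minimal sketch. The abstract does not give the formulas of the three measures, so everything below is an assumption made for illustration: documents and queries are sparse term-weight dictionaries, the conventional measure is plain cosine similarity, and the hypothetical `query_sensitive_sim` scales that cosine by how well the terms shared by both documents match the query. The helper `mean_relevant_neighbors` is likewise a hypothetical stand-in for the evaluation described above, not the authors' procedure.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse term-weight vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def conventional_sim(d1, d2, query):
    """Conventional measure: ignores the query entirely."""
    return cosine(d1, d2)


def query_sensitive_sim(d1, d2, query):
    """Hypothetical query-sensitive measure (an assumption, not the paper's).

    The conventional cosine is multiplied by the cosine between the query
    and the vector of terms that both documents contain, so pairs that
    jointly possess query terms are favored.
    """
    common = {t: d1[t] * d2[t] for t in d1 if t in d2}
    return cosine(d1, d2) * cosine(common, query)


def mean_relevant_neighbors(docs, relevant_ids, query, k, sim):
    """Average number of relevant documents among the k nearest neighbors
    of each relevant document, under the similarity function `sim`."""
    counts = []
    for i in relevant_ids:
        others = [j for j in docs if j != i]
        others.sort(key=lambda j: sim(docs[i], docs[j], query), reverse=True)
        counts.append(sum(1 for j in others[:k] if j in relevant_ids))
    return sum(counts) / len(counts)
```

Calling `mean_relevant_neighbors` once with `conventional_sim` and once with `query_sensitive_sim` over the same top-ranked documents gives the kind of comparison the abstract reports, although the measures and collections used in the paper may differ from this sketch.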