The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the Cluster Hypothesis. The hypothesis states that relevant documents tend to be highly similar to each other, and therefore tend to appear in the same clusters. In this paper we propose that, for any given query, pairs of relevant documents will exhibit an inherent similarity which is dictated by the query itself. Our research describes an attempt to devise means by which this similarity can be detected. We propose the use of query-sensitive similarity measures that bias interdocument relationships towards pairs of documents that jointly possess attributes that are expressed in a query. We experimentally tested query-sensitive measures against conventional ones that do not take the context of the query into account. We calculated interdocument relationships for varying numbers of top-ranked documents for five document collections. Our results show a consistent and significant increase in the number of relevant documents that become nearest neighbours of any given relevant document when query-sensitive measures are used. These results suggest that the effectiveness of a cluster-based IR system has the potential to increase through the use of query-sensitive similarity measures.
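The idea of biasing interdocument similarity towards the query, and then checking which documents become nearest neighbours, can be illustrated with a small sketch. The abstract does not give the formula for the measures, so the definition below is an assumption made for illustration only: conventional cosine similarity between two documents is multiplied by the cosine between the query and the vector of terms the two documents share. The function names, the four-term vocabulary, and the toy weights are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def query_sensitive_similarity(d1, d2, query):
    """Illustrative query-sensitive measure (an assumption, not the paper's
    confirmed definition): conventional cosine, scaled by how well the terms
    shared by d1 and d2 match the query."""
    common = [a * b for a, b in zip(d1, d2)]  # non-zero only for shared terms
    return cosine(d1, d2) * cosine(common, query)


def nearest_neighbour(index, docs, sim):
    """Return the index of the document most similar to docs[index]."""
    best, best_score = None, -1.0
    for j, other in enumerate(docs):
        if j == index:
            continue
        score = sim(docs[index], other)
        if score > best_score:
            best, best_score = j, score
    return best


# Hypothetical toy data over a four-term vocabulary [t0, t1, t2, t3].
query = [1, 1, 0, 0]          # the query mentions t0 and t1
docs = [
    [2, 1, 3, 0],             # doc 0: assumed relevant
    [1, 2, 0, 3],             # doc 1: assumed relevant, shares t0 and t1 with doc 0
    [0, 0, 3, 1],             # doc 2: assumed non-relevant, shares only t2 with doc 0
]

conventional_nn = nearest_neighbour(0, docs, cosine)
query_sensitive_nn = nearest_neighbour(
    0, docs, lambda a, b: query_sensitive_similarity(a, b, query)
)
print(conventional_nn, query_sensitive_nn)
```

In this toy setting the conventional measure picks the non-relevant document 2 as the nearest neighbour of document 0, because the two share a heavily weighted off-topic term. The query-sensitive variant picks document 1, since the terms it shares with document 0 are exactly the query terms. This mirrors, at a very small scale, the kind of nearest-neighbour comparison the abstract reports.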