Query-sensitive similarity measures for the calculation of interdocument relationships

Authors:
Anastasios Tombros;C. J. van Rijsbergen
Affiliations:
University of Glasgow, Glasgow, Scotland;University of Glasgow, Glasgow, Scotland
Venue:
Proceedings of the tenth international conference on Information and knowledge management
Year:
2001

Abstract

The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the Cluster Hypothesis. The hypothesis states that relevant documents tend to be highly similar to each other, and therefore tend to appear in the same clusters. In this paper we propose that, for any given query, pairs of relevant documents will exhibit an inherent similarity which is dictated by the query itself. Our research describes an attempt to devise means by which this similarity can be detected. We propose the use of query-sensitive similarity measures that bias interdocument relationships towards pairs of documents that jointly possess attributes that are expressed in a query. We experimentally tested query-sensitive measures against conventional ones that do not take the context of the query into account. We calculated interdocument relationships for varying numbers of top-ranked documents for five document collections. Our results show a consistent and significant increase in the number of relevant documents that become nearest neighbours of any given relevant document when query-sensitive measures are used. These results suggest that the effectiveness of a cluster-based IR system has the potential to increase through the use of query-sensitive similarity measures.
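The abstract does not give the formulas for the query-sensitive measures, so the following is only a minimal sketch of the idea under stated assumptions. It assumes documents and queries are sparse term-weight vectors, takes cosine as the conventional measure, and biases it by how well the terms shared by two documents match the query. The function names (`query_sensitive_sim`, `relevant_neighbours`), the averaging of shared-term weights, and the toy data are all hypothetical illustrations, not the authors' definitions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def query_sensitive_sim(d1, d2, query):
    """Assumed form: conventional cosine scaled by the match between
    the terms common to d1 and d2 and the query."""
    common = {t: (d1[t] + d2[t]) / 2 for t in d1 if t in d2}
    return cosine(d1, d2) * cosine(common, query)

def relevant_neighbours(docs, relevant, sim, k=1):
    """For each relevant document, count the relevant documents
    among its k nearest neighbours under the measure `sim`."""
    counts = {}
    for r in relevant:
        others = [d for d in docs if d != r]
        ranked = sorted(others, key=lambda d: sim(docs[r], docs[d]), reverse=True)
        counts[r] = sum(1 for d in ranked[:k] if d in relevant)
    return counts

# Toy data (hypothetical): A and B are relevant to the query, C is not.
query = {"jaguar": 1.0, "speed": 1.0}
docs = {
    "A": {"jaguar": 1.0, "speed": 1.0, "cat": 1.0, "habitat": 1.0},
    "B": {"jaguar": 1.0, "speed": 1.0, "car": 1.0, "engine": 1.0},
    "C": {"cat": 1.0, "habitat": 1.0, "zoo": 1.0},
}
relevant = {"A", "B"}

conventional = relevant_neighbours(docs, relevant, cosine)
sensitive = relevant_neighbours(
    docs, relevant, lambda a, b: query_sensitive_sim(a, b, query)
)
print(conventional, sensitive)
```

In this toy collection, plain cosine makes the non-relevant document C the nearest neighbour of A, because they share two off-topic terms. The query-biased measure makes B the nearest neighbour of A instead, because the terms A and B share are exactly the query terms. This is the kind of nearest-neighbour gain the abstract reports, shown here on invented data only.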