Interdocument similarities are the fundamental information source for cluster-based retrieval, an advanced retrieval approach that significantly improves information retrieval (IR) performance. One effective similarity metric is query-sensitive similarity, which was introduced by Tombros and van Rijsbergen as a method to more directly satisfy the cluster hypothesis that forms the basis of cluster-based retrieval. Although this method is reported to be effective, existing applications of query-sensitive similarity are still limited to vector space models and have no connection to probabilistic approaches. We propose a probabilistic framework that defines query-sensitive similarity in terms of probabilistic co-relevance: the similarity between two documents is proportional to the probability that they are co-relevant to a given query. We further simplify the proposed co-relevance-based similarity by decomposing it into two separate relevance models. We then formulate all the components required by the proposed similarity metric in terms of the scoring functions used by language modeling methods. Experimental results on standard TREC test collections consistently show that the proposed query-sensitive similarity measure outperforms both term-based similarity and existing query-sensitive similarity in Voorhees' nearest neighbor test (NNT).
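To make the idea of a co-relevance-based similarity concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes, purely for illustration, that the two relevance events are independent given the query, so the co-relevance score reduces to a product of two per-document relevance probabilities. Each relevance probability is approximated here by a Dirichlet-smoothed query-likelihood score normalized over the candidate documents. The function names, the smoothing parameter `mu`, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_tf, collection_len, mu=2000.0):
    """Dirichlet-smoothed log P(query | doc) under a unigram language model."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            # Assumption: query terms unseen in the collection are skipped.
            continue
        score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
    return score


def relevance_probs(query, docs, mu=2000.0):
    """Normalize query-likelihood scores over the candidate set.

    Assumption: this normalized score stands in for P(relevant | query, doc).
    """
    collection_tf = Counter(term for doc in docs for term in doc)
    collection_len = sum(collection_tf.values())
    logs = [
        query_log_likelihood(query, doc, collection_tf, collection_len, mu)
        for doc in docs
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in logs]
    total = sum(weights)
    return [w / total for w in weights]


def co_relevance_similarity(probs, i, j):
    """Illustrative co-relevance score: product of two relevance probabilities.

    Assumption: relevance of document i and document j is treated as
    independent given the query.
    """
    return probs[i] * probs[j]


if __name__ == "__main__":
    docs = [
        ["cluster", "retrieval", "language", "model"],
        ["cluster", "retrieval", "hypothesis"],
        ["image", "pixel", "color"],
    ]
    probs = relevance_probs(["cluster", "retrieval"], docs)
    print(co_relevance_similarity(probs, 0, 1))
    print(co_relevance_similarity(probs, 0, 2))
```

Under this sketch, two documents that both match the query score higher than a pair in which only one matches, which is the qualitative behavior a query-sensitive similarity is meant to have. A fuller treatment would replace the independence assumption with whatever decomposition into relevance models a given method actually specifies.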