On identifying representative relevant documents

Authors:
Fiana Raiber;Oren Kurland
Affiliations:
Technion - Israel Institute of Technology, Haifa, Israel;Technion - Israel Institute of Technology, Haifa, Israel
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Citing 31
Cited 2

Relevance feedback revisited

SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
Reexamining the cluster hypothesis: scatter/gather on retrieval results

SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
The effect of accessing nonmatching documents on relevance feedback

ACM Transactions on Information Systems (TOIS)
A language modeling approach to information retrieval

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The use of MMR, diversity-based reranking for reordering documents and producing summaries

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
The anatomy of a large-scale hypertextual Web search engine

WWW7 Proceedings of the seventh international conference on World Wide Web 7
A general language model for information retrieval (poster abstract)

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Relevance based language models

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A study of smoothing methods for language models applied to Ad Hoc information retrieval

Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Model-based feedback in the language modeling approach to information retrieval

Proceedings of the tenth international conference on Information and knowledge management
Information Retrieval

Information Retrieval
Predicting query performance

SIGIR '02 Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques

ACM Transactions on Information Systems (TOIS)
The effectiveness of query-specific hierarchic clustering in information retrieval

Information Processing and Management: an International Journal
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Language Modeling for Information Retrieval

Language Modeling for Information Retrieval
A survey on the use of relevance feedback for information access systems

The Knowledge Engineering Review
A review of relevance feedback experiments at the 2003 reliable information access (RIA) workshop.

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Active feedback in ad hoc information retrieval

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Information retrieval system evaluation: effort, sensitivity, and reliability

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
PageRank without hyperlinks: structural re-ranking using links induced by language models

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Poison pills: harmful relevant documents in feedback

Proceedings of the 14th ACM international conference on Information and knowledge management
Regularizing ad hoc retrieval scores

Proceedings of the 14th ACM international conference on Information and knowledge management
Respect my authority!: HITS without hyperlinks, utilizing cluster-based language models

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Training linear SVMs in linear time

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Inter-document similarities, language models, and ad hoc information retrieval

Inter-document similarities, language models, and ad hoc information retrieval
Reliable information retrieval evaluation with incomplete and biased judgements

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A comparison of statistical significance tests for information retrieval evaluation

Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
The opposite of smoothing: a language model approach to ranking query-specific document clusters

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Adaptive relevance feedback in information retrieval

Proceedings of the 18th ACM conference on Information and knowledge management
Evaluating text representations for retrieval of the best group of documents

ECIR'08 Proceedings of the IR research, 30th European conference on Advances in information retrieval

On bias problem in relevance feedback

Proceedings of the 20th ACM international conference on Information and knowledge management
Predicting Query Performance by Query-Drift Estimation

ACM Transactions on Information Systems (TOIS)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Using relevance feedback can significantly improve the effectiveness of ad hoc (query-based) retrieval. However, retrieval performance can significantly vary with respect to the given set of relevant documents. Our goal is to establish a quantitative analysis of what makes a relevant document a good representative of the relevant-documents set regardless of the retrieval approach employed. That is, we would like to estimate the extent to which a relevant document can effectively help in finding (other) relevant documents using some relevance-feedback method employed over the corpus. We present various representativeness estimates; some of which treat documents independently and some utilize inter-document similarities. Empirical evaluation shows that relevant documents that are centrally located within the similarity space of the relevant-documents set tend to be good representatives. In addition, we show that there exist highly representative clusters of similar relevant documents, and devise methods for ranking clusters based on their presumed representativeness. Finally, we study the connection between representativeness and TREC's gradual relevance judgments.