Current measures of novelty and diversity in information retrieval evaluation require explicit subtopic judgments, adding complexity to the manual assessment process. In some sense, these subtopic judgments may be viewed as providing a crude indication of document similarity, since we might expect documents relevant to common subtopics to be more similar on average than documents sharing no common subtopic, even when these documents are relevant to the same overall topic. In this paper, we test this hypothesis using documents and judgments drawn from the TREC 2009 Web Track. Our experiments demonstrate that higher subtopic overlap correlates with higher cosine similarity, providing validation for the use of subtopic judgments and pointing to new possibilities for measuring novelty and diversity.
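The comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the term weighting or the exact overlap measure, so the sketch assumes raw term-frequency vectors over whitespace tokens and counts shared subtopics as the overlap. The function names and the three toy documents are invented for illustration and are not drawn from the TREC 2009 Web Track.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors (an assumed weighting)."""
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def subtopic_overlap(subtopics_a, subtopics_b):
    """Number of subtopics for which both documents are judged relevant (assumed measure)."""
    return len(set(subtopics_a) & set(subtopics_b))


def mean_similarity_by_overlap(docs):
    """Group every document pair by subtopic overlap and average the cosine similarity."""
    buckets = defaultdict(list)
    for (_, (text_a, subs_a)), (_, (text_b, subs_b)) in combinations(docs.items(), 2):
        overlap = subtopic_overlap(subs_a, subs_b)
        buckets[overlap].append(cosine_similarity(text_a, text_b))
    return {overlap: sum(sims) / len(sims) for overlap, sims in buckets.items()}


# Hypothetical toy data: doc id -> (text, set of relevant subtopic ids).
toy_docs = {
    "d1": ("jaguar car dealer price", {1}),
    "d2": ("jaguar car price review", {1}),
    "d3": ("jaguar cat habitat rainforest", {2}),
}

if __name__ == "__main__":
    for overlap, mean_sim in sorted(mean_similarity_by_overlap(toy_docs).items()):
        print(f"overlap={overlap}: mean cosine similarity={mean_sim:.2f}")
```

If the hypothesis holds, the mean similarity should rise as the overlap count rises. A real study would also need tokenization, stopword handling, and a significance test, all of which are omitted here.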