Ranking retrieval systems without relevance judgments
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
A new rank correlation coefficient for information retrieval
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
A comparative analysis of cascade measures for novelty and diversity
Proceedings of the fourth ACM international conference on Web search and data mining
Evaluating diversified search results using per-intent graded relevance
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Intent-based diversification of web search results: metrics and algorithms
Information Retrieval
Multiple testing in statistical analysis of systems-based information retrieval experiments
ACM Transactions on Information Systems (TOIS)
Summary of the NTCIR-10 INTENT-2 task: subtopic mining and search result diversification
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Summary of the NTCIR-10 INTENT-2 task: subtopic mining and search result diversification
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Hi-index | 0.00 |
To construct a diversified search test collection, a set of possible subtopics (or intents) needs to be determined for each topic, in one way or another, and perintent relevance assessments need to be obtained. In the TREC Web Track Diversity Task, subtopics are manually developed at NIST, based on results of automatic click log analysis; in the NTCIR INTENT Task, intents are determined by manually clustering 'subtopics strings' returned by participating systems. In this study, we address the following research question: Does the choice of intents for a test collection affect relative performances of diversified search systems? To this end, we use the TREC 2012 Web Track Diversity Task data and the NTCIR-10 INTENT-2 Task data, which share a set of 50 topics but have different intent sets. Our initial results suggest that the choice of intents may affect relative performances, and that this choice may be far more important than how many intents are selected for each topic