Searching distributed collections with inference networks
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Beyond independent relevance: methods and evaluation metrics for subtopic retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Relevant document distribution estimation method for resource selection
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Evaluating evaluation metrics based on the bootstrap
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Learning query intent from regularized click graphs
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Novelty and diversity in information retrieval evaluation
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Rank-biased precision for measurement of retrieval effectiveness
ACM Transactions on Information Systems (TOIS)
Proceedings of the Second ACM International Conference on Web Search and Data Mining
Sources of evidence for vertical selection
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
Expected reciprocal rank for graded relevance
Proceedings of the 18th ACM conference on Information and knowledge management
Central-rank-based collection selection in uncooperative distributed information retrieval
ECIR'07 Proceedings of the 29th European conference on IR research
Do user preferences and evaluation measures line up?
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Evaluating whole-page relevance
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
A comparative analysis of cascade measures for novelty and diversity
Proceedings of the fourth ACM international conference on Web search and data mining
Proceedings of the fourth ACM international conference on Web search and data mining
A methodology for evaluating aggregated search results
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Evaluating diversified search results using per-intent graded relevance
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Aggregated search result diversification
ICTIR'11 Proceedings of the Third international conference on Advances in information retrieval theory
Learning to aggregate vertical results into web search results
Proceedings of the 20th ACM international conference on Information and knowledge management
Evaluating large-scale distributed vertical search
Proceedings of the 9th workshop on Large-scale and distributed informational retrieval
Multiple testing in statistical analysis of systems-based information retrieval experiments
ACM Transactions on Information Systems (TOIS)
Evaluation with informational and navigational intents
Proceedings of the 21st international conference on World Wide Web
Assessing and predicting vertical intent for web queries
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Evaluating aggregated search pages
SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Federated search in the wild: the combined power of over a hundred search engines
Proceedings of the 21st ACM international conference on Information and knowledge management
Evaluating reward and risk for vertical selection
Proceedings of the 21st ACM international conference on Information and knowledge management
Diversified search evaluation: lessons from the NTCIR-9 INTENT task
Information Retrieval
Hi-index | 0.00 |
Aggregating search results from a variety of diverse verticals such as news, images, videos and Wikipedia into a single interface is a popular web search presentation paradigm. Although several aggregated search (AS) metrics have been proposed to evaluate AS result pages, their properties remain poorly understood. In this paper, we compare the properties of existing AS metrics under the assumptions that (1) queries may have multiple preferred verticals; (2) the likelihood of each vertical preference is available; and (3) the topical relevance assessments of results returned from each vertical is available. We compare a wide range of AS metrics on two test collections. Our main criteria of comparison are (1) discriminative power, which represents the reliability of a metric in comparing the performance of systems, and (2) intuitiveness, which represents how well a metric captures the various key aspects to be measured (i.e. various aspects of a user's perception of AS result pages). Our study shows that the AS metrics that capture key AS components (e.g., vertical selection) have several advantages over other metrics. This work sheds new lights on the further developments and applications of AS metrics.