Search result diversification aims to maximize the coverage of different pieces of relevant information in the search results. Many diversification methods have been proposed and studied, but the advantages and disadvantages of each method remain unclear. In this paper, we conduct a diagnostic study of two state-of-the-art diversification methods, with the goal of identifying their weaknesses so that their performance can be improved. Specifically, we design a set of perturbation tests that isolate the individual factors that affect diversification performance, namely relevance and diversity. The test results are expected to provide insight into how well each method handles these factors during diversification. Experimental results suggest that some methods perform better on queries whose originally retrieved documents are more relevant to the query, while other methods perform better when the retrieved documents are more diverse. We therefore propose ways to combine the existing methods according to the predicted relevance and diversity of each query's retrieved documents. Experimental results show that the combined methods can outperform the individual methods on TREC collections.
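The combination idea can be illustrated with a minimal sketch. This is not the procedure used in the paper, which the abstract does not specify: the function name `combine_by_predicted_factor`, the two predictor callables, the assumption that both predictors return scores on a comparable scale, and the simple comparison rule are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Sequence

# A diversifier takes a query and its initially retrieved documents
# and returns a re-ranked list. These types are illustrative assumptions.
Diversifier = Callable[[str, Sequence[str]], List[str]]
Predictor = Callable[[str, Sequence[str]], float]


def combine_by_predicted_factor(
    query: str,
    docs: Sequence[str],
    predict_relevance: Predictor,
    predict_diversity: Predictor,
    method_for_relevant: Diversifier,
    method_for_diverse: Diversifier,
) -> List[str]:
    """Pick one of two diversification methods per query.

    Assumption: both predictors return scores on a comparable scale,
    so a direct comparison is meaningful. The selection rule itself
    is hypothetical and stands in for whatever the paper uses.
    """
    relevance = predict_relevance(query, docs)
    diversity = predict_diversity(query, docs)
    # Use the method that favors relevant input when predicted relevance
    # dominates; otherwise use the method that favors diverse input.
    if relevance >= diversity:
        return method_for_relevant(query, docs)
    return method_for_diverse(query, docs)


# Toy stand-ins for two diversification methods (not real algorithms).
def keep_order(query: str, docs: Sequence[str]) -> List[str]:
    return list(docs)


def reverse_order(query: str, docs: Sequence[str]) -> List[str]:
    return list(reversed(docs))
```

In practice the two toy re-rankers would be replaced by the actual diversification methods under study, and the predictors by query performance estimates computed from the initial retrieval.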