Online selection of diverse results

Authors:
Debmalya Panigrahi;Atish Das Sarma;Gagan Aggarwal;Andrew Tomkins
Affiliations:
Massachusetts Institute of Technology, Cambridge, MA, USA;Google, Mountain View, CA, USA;Google, Mountain View, CA, USA;Google, Mountain View, CA, USA
Venue:
Proceedings of the fifth ACM international conference on Web search and data mining
Year:
2012



Abstract

The phenomenal growth in the volume of easily accessible information via various web-based services has made it essential for service providers to provide users with personalized representative summaries of such information. Further, online commercial services including social networking and micro-blogging websites, e-commerce portals, leisure and entertainment websites, etc. recommend interesting content to users that is simultaneously diverse on many different axes such as topic, geographic specificity, etc. The key algorithmic question in all these applications is the generation of a succinct, representative, and relevant summary from a large stream of data coming from a variety of sources. In this paper, we formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efficient algorithm. We analyze the algorithm using theoretical techniques to show that it always produces a nearly optimal solution. In addition, we perform large-scale experiments on both real-world and synthetically generated datasets, which confirm that our algorithm performs even better than its analytical guarantees in practice, and also outperforms other candidate algorithms for the problem by a wide margin.
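The abstract does not describe the algorithm itself, so the following is only a minimal illustrative sketch of the general problem setting, not the paper's method. It assumes each streamed item carries a set of features (for example topics or regions), that decisions are irrevocable, and that an item is worth keeping when it covers at least `min_gain` features not yet covered. The names `select_online`, `min_gain`, and the sample stream are all hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

# Hypothetical item representation: (item id, set of features it covers).
Item = Tuple[str, Set[Hashable]]


def select_online(stream: Iterable[Item], k: int, min_gain: int = 1) -> List[str]:
    """Keep up to k items, each adding at least min_gain uncovered features.

    Illustrative threshold rule only; it is NOT the algorithm from the paper.
    """
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    for item_id, features in stream:
        if len(chosen) >= k:
            break  # summary is full; later items are ignored
        gain = len(features - covered)  # features this item would newly cover
        if gain >= min_gain:
            chosen.append(item_id)  # irrevocable online decision
            covered |= features
    return chosen


# Made-up example stream, in arrival order.
stream = [
    ("a", {"sports", "us"}),
    ("b", {"sports"}),
    ("c", {"politics", "eu"}),
    ("d", {"sports", "eu"}),
    ("e", {"tech"}),
]
picked = select_online(stream, k=3)
print(picked)
```

In this sketch, items "b" and "d" are skipped because everything they cover has already been covered by earlier picks. A simple rule like this carries no optimality guarantee; the paper claims a near-optimal guarantee for its own algorithm, which is not reproduced here.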