Diversity maximization under matroid constraints

Authors:
Zeinab Abbassi;Vahab S. Mirrokni;Mayur Thakur
Affiliations:
Columbia University, New York, NY, USA;Google Research, New York, NY, USA;Google, New York, NY, USA
Venue:
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2013

Citing 24
Cited 0

A threshold of ln n for approximating set cover

Journal of the ACM (JACM)
The use of MMR, diversity-based reranking for reordering documents and producing summaries

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Algorithms for facility location problems with outliers

SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
Similarity estimation techniques from rounding algorithms

STOC '02 Proceedings of the thiry-fourth annual ACM symposium on Theory of computing
A hierarchical monothetic document clustering algorithm for summarization and browsing search results

Proceedings of the 13th international conference on World Wide Web
Being accurate is not enough: how accuracy metrics have hurt recommender systems

CHI '06 Extended Abstracts on Human Factors in Computing Systems
Précis: The Essence of a Query Answer

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Improving personalized web search using result diversification

SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Extracting redundancy-aware top-k patterns

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Addressing diverse user preferences in SQL-query-result navigation

Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Generating diverse and representative image search results for landmarks

Proceedings of the 17th international conference on World Wide Web
An axiomatic approach for result diversification

Proceedings of the 18th international conference on World wide web
Efficient Computation of Diverse Query Results

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Recommendation Diversification Using Explanations

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Turning down the noise in the blogosphere

Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
Actively predicting diverse search intent from user browsing behaviors

Proceedings of the 19th international conference on World wide web
Diversifying web search results

Proceedings of the 19th international conference on World wide web
Exploiting query reformulations for web search result diversification

Proceedings of the 19th international conference on World wide web
Consideration set generation in commerce search

Proceedings of the 20th international conference on World wide web
Efficient diversity-aware search

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
On query result diversification

ICDE '11 Proceedings of the 2011 IEEE 27th International Conference on Data Engineering
Online selection of diverse results

Proceedings of the fifth ACM international conference on Web search and data mining
Approximation algorithms for maximum dispersion

Operations Research Letters
Diversifying top-k results

Proceedings of the VLDB Endowment

Quantified Score

Hi-index	0.00

Visualization

Abstract

Aggregator websites typically present documents in the form of representative clusters. In order for users to get a broader perspective, it is important to deliver a diversified set of representative documents in those clusters. One approach to diversification is to maximize the average dissimilarity among documents. Another way to capture diversity is to avoid showing several documents from the same category (e.g. from the same news channel). We combine the above two diversification concepts by modeling the latter approach as a (partition) matroid constraint, and study diversity maximization problems under matroid constraints. We present the first constant-factor approximation algorithm for this problem, using a new technique. Our local search 0.5-approximation algorithm is also the first constant-factor approximation for the max-dispersion problem under matroid constraints. Our combinatorial proof technique for maximizing diversity under matroid constraints uses the existence of a family of Latin squares which may also be of independent interest. In order to apply these diversity maximization algorithms in the context of aggregator websites and as a preprocessing step for our diversity maximization tool, we develop greedy clustering algorithms that maximize weighted coverage of a predefined set of topics. Our algorithms are based on computing a set of cluster centers, where clusters are formed around them. We show the better performance of our algorithms for diversity and coverage maximization by running experiments on real (Twitter) and synthetic data in the context of real-time search over micro-posts. Finally we perform a user study validating our algorithms and diversity metrics.