Aggregator websites typically present documents as representative clusters. To give users a broader perspective, it is important to deliver a diversified set of representative documents within those clusters. One approach to diversification is to maximize the average dissimilarity among the selected documents. Another is to avoid showing several documents from the same category (e.g., from the same news channel). We combine these two notions by modeling the latter as a (partition) matroid constraint, and we study diversity maximization problems under matroid constraints. We present the first constant-factor approximation algorithm for this problem, using a new technique. Our local search 0.5-approximation algorithm is also the first constant-factor approximation for the max-dispersion problem under matroid constraints. Our combinatorial proof technique for maximizing diversity under matroid constraints relies on the existence of a family of Latin squares, which may also be of independent interest. To apply these diversity maximization algorithms in the context of aggregator websites, and as a preprocessing step for our diversity maximization tool, we develop greedy clustering algorithms that maximize weighted coverage of a predefined set of topics. These algorithms compute a set of cluster centers and form clusters around them. We demonstrate the improved performance of our algorithms for diversity and coverage maximization through experiments on real (Twitter) and synthetic data in the context of real-time search over micro-posts. Finally, we perform a user study validating our algorithms and diversity metrics.
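To make the problem statement concrete, the following is a minimal sketch of swap-based local search for sum-of-distances diversity under a partition matroid constraint (at most a fixed number of documents per category). It is an illustration under stated assumptions, not the authors' algorithm: the function names, the single-swap rule, the toy distances, and the category names are all hypothetical, and the sketch carries no claim about the approximation guarantee mentioned in the abstract.

```python
from itertools import combinations

def dispersion(selected, dist):
    """Sum of pairwise distances among the selected items."""
    return sum(dist[a][b] for a, b in combinations(selected, 2))

def is_independent(selected, category, capacity):
    """Partition matroid check: at most capacity[c] items from category c."""
    counts = {}
    for item in selected:
        c = category[item]
        counts[c] = counts.get(c, 0) + 1
        if counts[c] > capacity[c]:
            return False
    return True

def local_search(items, dist, category, capacity, k):
    """Swap one item at a time while the dispersion strictly improves."""
    # Start from a feasible set of up to k items, built in input order.
    current = []
    for item in items:
        if len(current) < k and is_independent(current + [item], category, capacity):
            current.append(item)
    improved = True
    while improved:
        improved = False
        for out in list(current):
            for cand in items:
                if cand in current:
                    continue
                trial = [x for x in current if x != out] + [cand]
                if (is_independent(trial, category, capacity)
                        and dispersion(trial, dist) > dispersion(current, dist)):
                    current = trial  # accept the improving swap and restart
                    improved = True
                    break
            if improved:
                break
    return current

# Hypothetical toy instance: four documents from two news channels.
items = ["a", "b", "c", "d"]
category = {"a": "channel1", "b": "channel1", "c": "channel2", "d": "channel2"}
capacity = {"channel1": 1, "channel2": 1}  # at most one document per channel
pairs = {("a", "b"): 4, ("a", "c"): 2, ("a", "d"): 3,
         ("b", "c"): 3, ("b", "d"): 5, ("c", "d"): 4}
dist = {x: {} for x in items}
for (x, y), d in pairs.items():
    dist[x][y] = d
    dist[y][x] = d

result = local_search(items, dist, category, capacity, k=2)
print(sorted(result), dispersion(result, dist))
```

On this toy instance the search starts from the pair (a, c) and moves through improving swaps to (b, d), the most dissimilar pair that still takes one document from each channel. Each accepted swap strictly increases the objective, so the loop terminates on any finite input.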