In commerce search, the set of products returned by a search engine often forms the basis for all user interactions leading up to a potential transaction on the web. Such a set of products is known as the consideration set. In this study, we consider the problem of generating a consideration set of products in commerce search so as to maximize user satisfaction. A key feature of commerce search that we exploit is that products are associated with a set of important attributes, and user queries with a set of specified attributes. Important attributes not used in the query are treated as unspecified. The attribute space admits a natural definition of user satisfaction via user preferences on the attributes and their values: the surfaced products should be close to the attribute values specified in the query and diverse with respect to the unspecified attributes. We model this as the General Max-Sum Dispersion problem: given a set of n nodes in a metric space, select a subset of nodes with total cost at most a given budget so as to maximize the sum of the pairwise distances between the selected nodes. In our setting, each node denotes a product, and the cost of a node is inversely proportional to its relevance with respect to the specified attributes. The distance between two nodes quantifies their diversity with respect to the unspecified attributes. The problem is NP-hard, and a 2-approximation was previously known only when all nodes have unit cost; we make no assumptions on the costs. We give the first constant-factor approximation algorithm for this problem, achieving an approximation ratio of 2. Further, we perform extensive empirical analysis on real-world data to show the effectiveness of our algorithm.
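To make the objective concrete, the budgeted dispersion problem described above can be stated in a few lines of Python. This is a minimal sketch of the problem definition only, not the 2-approximation algorithm of the paper: it solves tiny instances by exhaustive search, which takes exponential time. The function names, the use of Euclidean distance as the metric, and the toy points, costs, and budget are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def dispersion(points, subset):
    """Sum of pairwise Euclidean distances between the selected nodes."""
    return sum(dist(points[i], points[j]) for i, j in combinations(subset, 2))


def best_subset_brute_force(points, costs, budget):
    """Try every subset whose total cost fits the budget; keep the most dispersed.

    Exponential in the number of nodes, so suitable only for toy instances.
    """
    n = len(points)
    best, best_value = (), 0.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            if sum(costs[i] for i in subset) > budget:
                continue  # infeasible: exceeds the budget
            value = dispersion(points, subset)
            if value > best_value:
                best, best_value = subset, value
    return best, best_value


# Hypothetical toy instance: four products embedded in the plane.
# A higher cost stands in for lower relevance to the specified attributes.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
costs = [1.0, 1.0, 1.0, 5.0]
budget = 3.0

chosen, value = best_subset_brute_force(points, costs, budget)
print(chosen, value)  # (0, 1, 2) 12.0
```

In this instance the fourth node is too expensive to select under the budget, so the best feasible set is the first three nodes, whose pairwise distances 3, 4, and 5 sum to 12.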