We study the problem of aggregating and summarizing partial orders on a large scale. Our motivation is twofold: to discover elements at similar preference levels, and to reduce the number of bits needed to store an element's position in a full ranking.

We proceed in two steps. First, we find a total order by linearizing the rankings induced by the multiple partial orders and removing potentially inconsistent pairwise preferences. Next, given a total order, we introduce and formalize the rank quantization problem, which intuitively aims to bucketize the total order in a manner that mostly preserves the relations appearing in the partial orders. We show an exact quadratic-time quantization algorithm, as well as a greedy 2/3-approximation algorithm whose running time is substantially lower on sparse instances. As an application, we aggregate rankings of top-10 search results over millions of search engine queries, approximately reproducing and then efficiently encoding the underlying static ranks used by the engine. We evaluate the performance of our algorithms on a web dataset of 12 million ($2^{23.5}$) unique pages and show that we can quantize the pages' static ranks using as few as eight bits, with only a minor degradation in search quality.
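To make the bit-count argument concrete, the following minimal sketch computes how many bits a full ranking position needs and measures how many pairwise preferences survive a bucketing of a total order. It is an illustration only: the function names are hypothetical, and the equal-width bucketing used here is a naive baseline chosen for simplicity, not the exact or greedy quantization algorithm described above.

```python
import math


def bits_needed(n_items):
    """Bits required to give each of n_items a distinct rank position."""
    return math.ceil(math.log2(n_items))


def uniform_buckets(total_order, n_bits):
    """Map each item to a bucket id by cutting the total order into
    (at most) 2**n_bits equal-width buckets. Naive baseline (assumption),
    not the quantization algorithm from the abstract."""
    n = len(total_order)
    n_buckets = min(2 ** n_bits, n)
    return {item: pos * n_buckets // n for pos, item in enumerate(total_order)}


def preserved_fraction(bucket_of, preferences):
    """Fraction of (preferred, other) pairs whose items fall into
    strictly ordered buckets, so the preference is still recoverable."""
    kept = sum(1 for a, b in preferences if bucket_of[a] < bucket_of[b])
    return kept / len(preferences)


# Toy data (hypothetical): a total order over eight items and four
# pairwise preferences taken from the input partial orders.
order = list("abcdefgh")
prefs = [("a", "b"), ("a", "e"), ("c", "h"), ("f", "g")]

print(bits_needed(12_000_000))   # bits for a full rank over 12 million pages
buckets = uniform_buckets(order, n_bits=1)
print(preserved_fraction(buckets, prefs))
```

With 12 million pages, a full rank position takes 24 bits, so an eight-bit code (256 buckets) is a threefold reduction per page. In the toy run, one bit splits the order into two buckets, and the preferences whose two items land in the same bucket are the ones lost. A quantization algorithm would instead place bucket boundaries so as to lose as few such pairs as possible.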