Using association rules for product assortment decisions: a case study
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient k-NN search on vertically decomposed data
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Minimal probing: supporting expensive predicates for top-k queries
Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Top-k selection queries over relational databases: Mapping strategies and performance evaluation
ACM Transactions on Database Systems (TODS)
Introduction to Algorithms
Optimizing Multi-Feature Queries for Image Databases
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Supporting Incremental Join Queries on Ranked Inputs
Proceedings of the 27th International Conference on Very Large Data Bases
Query Processing Issues in Image(Multimedia) Databases
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Optimal aggregation algorithms for middleware
Journal of Computer and System Sciences - Special issu on PODS 2001
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Towards Efficient Multi-Feature Queries in Heterogeneous Environments
ITCC '01 Proceedings of the International Conference on Information Technology: Coding and Computing
Evaluating Top-k Queries over Web-Accessible Databases
ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Evaluating top-k queries over web-accessible databases
ACM Transactions on Database Systems (TODS)
On the integration of structure indexes and inverted lists
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Optimizing Top-k Selection Queries over Multimedia Repositories
IEEE Transactions on Knowledge and Data Engineering
Efficient top-K query calculation in distributed networks
Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
Progressive Distributed Top-k Retrieval in Peer-to-Peer Networks
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
TinyDB: an acquisitional query processing system for sensor networks
ACM Transactions on Database Systems (TODS) - Special Issue: SIGMOD/PODS 2003
RankSQL: query algebra and optimization for relational top-k queries
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Improving collection selection with overlap awareness in P2P search engines
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
The threshold join algorithm for top-k queries in distributed sensor networks
DMSN '05 Proceedings of the 2nd international workshop on Data management for sensor networks
KLEE: a framework for distributed top-k query algorithms
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Efficient Query Evaluation on Large Textual Collections in a Peer-to-Peer Environment
P2P '05 Proceedings of the Fifth IEEE International Conference on Peer-to-Peer Computing
Proceedings of the 15th international conference on World Wide Web
Reducing network traffic in unstructured P2P systems using Top-k queries
Distributed and Parallel Databases
Pruned query evaluation using pre-computed impacts
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Answering top-k queries using views
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Adaptive rank-aware query optimization in relational databases
ACM Transactions on Database Systems (TODS)
Progressive and selective merge: computing top-k with ad-hoc ranking functions
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Spark: top-k keyword query in relational databases
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
The history of histograms (abridged)
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Optimized query execution in large search engines with global page ordering
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Top-k query evaluation with probabilistic guarantees
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Approximate NN queries on streams with guaranteed error/performance bounds
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Probabilistic ranking of database query results
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Ad-hoc top-k query answering for data streams
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Depth estimation for ranking query optimization
The VLDB Journal — The International Journal on Very Large Data Bases
Efficient processing of distributed top-k queries
DEXA'05 Proceedings of the 16th international conference on Database and Expert Systems Applications
Consistent query answers in inconsistent probabilistic databases
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Cardinality estimation and dynamic length adaptation for Bloom filters
Distributed and Parallel Databases
Top-k vectorial aggregation queries in a distributed environment
Journal of Parallel and Distributed Computing
Peer-to-peer web search: euphoria, achievements, disillusionment, and future opportunities
From active data management to event-based systems and more
Efficient query answering in probabilistic RDF graphs
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
k-nearest keyword search in RDF graphs
Web Semantics: Science, Services and Agents on the World Wide Web
Hi-index | 0.00 |
Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.