A modern query optimizer typically picks a single query plan for all data based on overall data statistics. However, many have observed that real-life datasets tend to have non-uniform distributions, so a single query plan may execute inefficiently for large portions of the actual data. In addition, most stream query processing systems, given the volume of data, cannot precisely model the system state, much less account for uncertainty due to continuous variations. Such systems select a single query plan based upon imprecise statistics. In this paper, we present "Query Mesh" (QM), a practical alternative to state-of-the-art data stream processing approaches. The main idea of QM is to compute multiple routes (i.e., query plans), each designed for a particular subset of the data with distinct statistical properties. We use the terms "plans" and "routes" interchangeably in our work. A classifier model is induced and used to assign the best route to each incoming tuple based upon its data characteristics. We formulate the QM search space and analyze its complexity. Because the search space is substantial, we propose several cost-based query optimization heuristics designed to effectively find nearly optimal QMs. We propose the Self-Routing Fabric (SRF) infrastructure, which supports query execution with multiple plans without physically constructing their topologies or using a central router such as Eddy. We also consider how to support uncertain route specification and execution in QM, which can occur when imprecise statistics lead to more than one optimal route for a subset of data. Our experimental results indicate that QM consistently provides better query execution performance and incurs negligible overhead compared to the alternative state-of-the-art data stream approaches.
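The core idea of routing each tuple through a plan chosen by a classifier can be illustrated with a minimal Python sketch. It is not the paper's implementation: the operators, the route names, the `region` attribute, and the hand-written `classify` rule (standing in for an induced classifier model) are all hypothetical, and the Self-Routing Fabric is not modeled. The sketch only shows how two orderings of the same selection operators can cost different amounts of work depending on which tuples they receive.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]

# Hypothetical selection operators; every route applies the same set,
# only the evaluation order differs.
OPERATORS: Dict[str, Callable[[Record], bool]] = {
    "price": lambda t: t["price"] < 100,
    "volume": lambda t: t["volume"] > 1000,
}

# Hypothetical routes (query plans): each is an ordering of the operators.
ROUTES: Dict[str, List[str]] = {
    "price_first": ["price", "volume"],
    "volume_first": ["volume", "price"],
}


def classify(t: Record) -> str:
    """Toy stand-in for an induced classifier: pick a route from tuple content.

    Assumption for illustration: EU tuples usually fail the price filter,
    so it pays to evaluate that filter first for them.
    """
    return "price_first" if t["region"] == "EU" else "volume_first"


def process(t: Record) -> Tuple[bool, int]:
    """Run a tuple along its assigned route.

    Returns (passed_all_filters, number_of_operators_evaluated).
    Evaluation stops at the first operator that rejects the tuple.
    """
    evaluated = 0
    for name in ROUTES[classify(t)]:
        evaluated += 1
        if not OPERATORS[name](t):
            return False, evaluated
    return True, evaluated


if __name__ == "__main__":
    stream = [
        {"region": "EU", "price": 150, "volume": 5000},
        {"region": "US", "price": 150, "volume": 5000},
        {"region": "US", "price": 50, "volume": 500},
    ]
    for rec in stream:
        print(classify(rec), process(rec))
```

In this sketch the first tuple is rejected after one operator because its route evaluates the price filter first, while the second tuple, identical except for `region`, needs two evaluations on the other route. A single fixed plan would be suboptimal for one of the two subsets, which is the situation the multi-route approach targets.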