Traditional sampling-based estimators infer the selectivity of a query purely from information gathered at run time and discard previously collected information, so they underutilize what is available. Table-based and parametric estimators extrapolate the selectivity only from previously collected information and ignore on-line information, which makes them inaccurate in frequently updated environments. We propose a novel hybrid estimator that optimally combines on-line and previously collected information. Theoretical analysis shows that the two sources of information are complementary and that using them together yields further performance improvement. We validate the theoretical results with a comprehensive experimental study on a practical database in the presence of insert, delete, and update operations. The hybrid approach is promising because it provides an adaptive mechanism for optimally combining information obtained from different sources, achieving higher estimation accuracy and reliability.
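To make the idea of combining the two information sources concrete, the following minimal sketch merges an on-line sample estimate with a previously stored estimate using inverse-variance weighting. This is a standard statistical technique chosen here purely for illustration; it is an assumption, not necessarily the combination rule of the hybrid estimator described above. The names `Estimate`, `sample_estimate`, and `combine`, and the variance attached to the stored estimate, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A selectivity estimate with an (assumed known) variance."""
    selectivity: float
    variance: float


def sample_estimate(matches: int, sample_size: int) -> Estimate:
    """On-line estimate from a uniform random sample of tuples.

    Uses the binomial variance of a sample proportion. A small floor
    keeps the variance positive when the sample has 0% or 100% matches.
    """
    p = matches / sample_size
    variance = max(p * (1.0 - p) / sample_size, 1e-12)
    return Estimate(p, variance)


def combine(online: Estimate, prior: Estimate) -> Estimate:
    """Inverse-variance weighted combination (illustrative assumption).

    The more certain source (smaller variance) receives more weight,
    and the combined variance is smaller than either input variance.
    """
    w_online = 1.0 / online.variance
    w_prior = 1.0 / prior.variance
    total = w_online + w_prior
    selectivity = (w_online * online.selectivity
                   + w_prior * prior.selectivity) / total
    return Estimate(selectivity, 1.0 / total)


if __name__ == "__main__":
    # Hypothetical numbers: 30 of 200 sampled tuples satisfy the predicate,
    # and stored statistics suggest a selectivity of 0.10.
    online = sample_estimate(matches=30, sample_size=200)
    prior = Estimate(selectivity=0.10, variance=online.variance)
    print(combine(online, prior))
```

Under this scheme, stored statistics that have gone stale after many inserts, deletes, and updates could be assigned a larger variance, which shifts weight toward the on-line sample; how that variance should grow with update activity is left open here.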