Exploiting constraint-like data characterizations in query optimization

Authors:
Parke Godfrey;Jarek Gryz;Calisto Zuzarte
Affiliations:
York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada;York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada;IBM Canada Ltd., 1150 Eglinton Ave. E., Toronto, Ontario M3C 1H7, Canada
Venue:
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Year:
2001

Abstract

Query optimizers nowadays draw upon many sources of information about the database to optimize queries. They employ runtime statistics in cost-based estimation of query plans. They employ integrity constraints in the query rewrite process. Primary and foreign key constraints have long played a role in the optimizer, both for rewrite opportunities and for providing more accurate cost predictions. More recently, other types of integrity constraints are being exploited by optimizers in commercial systems, for which certain semantic query optimization techniques have now been implemented.

These new optimization strategies that exploit constraints hold the promise of good improvement. Their weakness, however, is that the “constraints” that would be useful for optimization for a given database and workload are often not explicitly available to the optimizer. Data mining tools can find such “constraints” that are true of the database, but then there is the question of how this information can be kept by the database system, and how to make it available to, and effectively usable by, the optimizer.

We present our work on soft constraints in DB2. A soft constraint is a syntactic statement equivalent to an integrity constraint declaration. A soft constraint is not really a constraint, per se, since future updates may undermine it. While a soft constraint is valid, however, it can be used by the optimizer in the same way integrity constraints are. We present two forms of soft constraint: absolute and statistical. An absolute soft constraint is consistent with respect to the current state of the database, in just the same way an integrity constraint must be. Absolute soft constraints can be used in rewrite, as well as in cost estimation. A statistical soft constraint differs in that it may have some degree of violation with respect to the state of the database. Thus, statistical soft constraints cannot be used in rewrite, but they can still be used in cost estimation.

We are working long-term on absolute soft constraints. We discuss the issues involved in implementing a facility for absolute soft constraints in a database system (and in DB2), and the strategies that we are researching. The current DB2 optimizer is more amenable to adding facilities for statistical soft constraints. In the short term, we have been implementing pathways in the optimizer for statistical soft constraints. We discuss this implementation.
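
The distinction between absolute and statistical soft constraints can be illustrated with a small sketch. This is not the DB2 implementation and is not taken from the paper; the names `SoftConstraint`, `validate`, and `usable_in_rewrite`, the `order_day`/`ship_day` columns, and the "shipped within 30 days" rule are all hypothetical, chosen only to show how a mined rule might be classified against the current database state.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Mapping

Row = Mapping[str, int]


@dataclass
class SoftConstraint:
    """Hypothetical record of a mined rule and how well the data satisfies it."""
    name: str
    predicate: Callable[[Row], bool]
    confidence: float  # fraction of rows satisfying the predicate when validated

    @property
    def is_absolute(self) -> bool:
        # Absolute: no violations in the state of the data that was checked.
        return self.confidence == 1.0


def validate(name: str, predicate: Callable[[Row], bool],
             rows: Iterable[Row]) -> SoftConstraint:
    """Check a candidate rule against the current rows and record its confidence."""
    rows = list(rows)
    if not rows:
        return SoftConstraint(name, predicate, 1.0)  # vacuously satisfied
    satisfied = sum(1 for row in rows if predicate(row))
    return SoftConstraint(name, predicate, satisfied / len(rows))


def usable_in_rewrite(constraint: SoftConstraint) -> bool:
    # Rewrites must preserve query answers, so only absolute constraints qualify.
    return constraint.is_absolute


def usable_in_cost_estimation(constraint: SoftConstraint) -> bool:
    # Estimates tolerate some error, so both forms can inform them.
    return True


def ships_within_30_days(row: Row) -> bool:
    return row["ship_day"] - row["order_day"] <= 30


# Hypothetical sample data: the third row violates the rule.
orders = [
    {"order_day": 0, "ship_day": 5},
    {"order_day": 10, "ship_day": 20},
    {"order_day": 3, "ship_day": 50},
    {"order_day": 7, "ship_day": 9},
]

statistical = validate("ship_within_30", ships_within_30_days, orders)
absolute = validate("ship_within_30", ships_within_30_days, orders[:2])

print(statistical.confidence, usable_in_rewrite(statistical))
print(absolute.confidence, usable_in_rewrite(absolute))
```

In this sketch a later insert that violates the rule would turn the absolute constraint into a statistical one, which is the sense in which future updates may undermine a soft constraint; a real system would also need a way to detect that and stop using the constraint in rewrite.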