CORADD: correlation aware database designer for materialized views and indexes

Authors:
Hideaki Kimura; George Huo; Alexander Rasin; Samuel Madden; Stanley B. Zdonik
Affiliations:
Brown University; Google, Inc.; Brown University; MIT CSAIL; Brown University
Venue:
Proceedings of the VLDB Endowment
Year:
2010

Abstract

We describe an automatic database design tool that exploits correlations between attributes when recommending materialized views (MVs) and indexes. Although a substantial body of related work explores how to select an appropriate set of MVs and indexes for a given workload, none of it has examined the effect of correlated attributes (e.g., attributes encoding related geographic information) on designs. Our tool identifies a set of MVs and secondary indexes such that correlations between the clustered attributes of the MVs and the secondary indexes are enhanced, which can dramatically improve query performance. It uses a form of Integer Linear Programming (ILP) called ILP Feedback to pick the best set of MVs and indexes for given database size constraints. We compare our tool with a state-of-the-art commercial database designer on two workloads, APB-1 and SSB (the Star Schema Benchmark, which is similar to TPC-H). Our results show that a correlation-aware database designer can improve query performance by up to a factor of 6 within the same space budget when compared to a commercial database designer.
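
The selection step described in the abstract (choose MVs and indexes that minimize workload cost within a size budget) can be illustrated with a toy example. The sketch below is not the paper's ILP Feedback method: it replaces the ILP solver with exhaustive enumeration, which is only practical for a handful of candidates. All candidate names, sizes, query costs, and the `BASE_COST` fallback are hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical candidate objects and their sizes in MB (illustrative only).
CANDIDATE_SIZES = {"mv_geo": 40, "mv_date": 30, "idx_city": 10, "idx_month": 15}

# Hypothetical estimated runtime (seconds) of each query under each candidate.
# A candidate missing from a query's entry cannot help that query.
QUERY_COSTS = {
    "q1": {"mv_geo": 2.0, "idx_city": 5.0},
    "q2": {"mv_date": 1.5, "idx_month": 4.0},
    "q3": {"mv_geo": 3.0, "mv_date": 6.0},
}

# Assumed cost of a query when no chosen object can serve it.
BASE_COST = 20.0


def workload_cost(chosen):
    """Sum over queries of the cheapest cost among the chosen objects."""
    total = 0.0
    for costs in QUERY_COSTS.values():
        usable = [cost for name, cost in costs.items() if name in chosen]
        total += min(usable, default=BASE_COST)
    return total


def best_design(budget_mb):
    """Return (cost, subset) minimizing workload cost within the size budget.

    Enumerates every subset, smallest first, so ties go to the smaller design.
    """
    names = sorted(CANDIDATE_SIZES)
    best = (workload_cost(set()), ())
    for r in range(1, len(names) + 1):
        for subset in combinations(names, r):
            if sum(CANDIDATE_SIZES[n] for n in subset) > budget_mb:
                continue  # violates the space constraint
            cost = workload_cost(set(subset))
            if cost < best[0]:
                best = (cost, subset)
    return best


if __name__ == "__main__":
    for budget in (0, 50, 100):
        print(budget, best_design(budget))
```

In an ILP formulation of the same toy problem, each candidate would get a 0/1 decision variable, the size budget would become a linear constraint, and each query would be assigned to exactly one chosen object. Enumeration is used here only to keep the example free of solver dependencies.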