A new tool for multi-level partitioning in Teradata

Authors:
Young-Kyoon Suh;Ahmad Ghazal;Alain Crolotte;Pekka Kostamaa
Affiliations:
University of Arizona, Tucson, AZ, USA;Teradata Corporation, El Segundo, CA, USA;Teradata Corporation, El Segundo, CA, USA;Teradata Corporation, El Segundo, CA, USA
Venue:
Proceedings of the 21st ACM international conference on Information and knowledge management
Year:
2012

Abstract

This paper introduces a new tool that recommends an optimized partitioning solution, called Multi-Level Partitioned Primary Index (MLPPI), for a fact table based on the queries in a workload. The tool implements a new technique that uses a greedy algorithm to enumerate the search space, which is driven by the predicates in the queries. This technique fits the Teradata MLPPI scheme well because it is based on a general framework that uses general expressions, ranges, and case expressions for partition definitions. The cost model implemented in the tool is based on the Teradata optimizer and is used to prune the search space and reach a final solution. The tool resides entirely on the client and interfaces with the database through APIs, in contrast to previous work that requires extending the optimizer code. The APIs are used to simplify the workload queries and to capture the fact table predicates and costs needed to make the recommendation. The predicate-driven method implemented by the tool is general and can be applied to any clustering or partitioning scheme based on simple field expressions or complex SQL predicates. Experimental results for a particular workload show that the tool's recommendation outperforms that of a human expert. The experiments also show that the solution scales with both the workload complexity and the size of the fact table.
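
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of greedy, predicate-driven, cost-pruned enumeration. Everything in it is an assumption made for illustration: the function names `make_toy_cost` and `greedy_partitioning`, the representation of a query as a map from predicate column to selectivity, and the toy cost formula, which stands in for cost estimates that the real tool obtains from the Teradata optimizer through client-side APIs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical representation: a query maps each fact-table column it
# filters on to the fraction of rows its predicate keeps (selectivity).
Query = Dict[str, float]
CostFn = Callable[[Sequence[str]], float]


def make_toy_cost(queries: List[Query], table_rows: int,
                  level_overhead: float) -> CostFn:
    """Build a toy workload cost function (NOT the Teradata cost model).

    A query scans only the rows left after partition elimination on every
    chosen level it has a predicate for; each level adds a fixed per-query
    overhead so that extra levels are not free.
    """
    def cost(levels: Sequence[str]) -> float:
        total = 0.0
        for query in queries:
            scanned = float(table_rows)
            for column in levels:
                scanned *= query.get(column, 1.0)  # no predicate: no pruning
            total += scanned
        return total + level_overhead * len(levels) * len(queries)
    return cost


def greedy_partitioning(candidates: Sequence[str], workload_cost: CostFn,
                        max_levels: int) -> Tuple[List[str], float]:
    """Greedily add the partitioning level that lowers workload cost most.

    Stops when no remaining candidate improves the cost (the pruning step)
    or when max_levels is reached.
    """
    chosen: List[str] = []
    remaining = list(candidates)
    best_cost = workload_cost(chosen)
    while remaining and len(chosen) < max_levels:
        trial_cost, trial = min(
            (workload_cost(chosen + [c]), c) for c in remaining
        )
        if trial_cost >= best_cost:
            break  # no candidate helps: stop enumerating
        chosen.append(trial)
        remaining.remove(trial)
        best_cost = trial_cost
    return chosen, best_cost


# Made-up workload: candidate levels come from the query predicates.
workload = [
    {"sale_date": 0.125, "region": 0.25},
    {"sale_date": 0.5},
    {"region": 0.25, "channel": 0.5},
]
candidate_columns = sorted({col for q in workload for col in q})
toy_cost = make_toy_cost(workload, table_rows=1000, level_overhead=50.0)
levels, final_cost = greedy_partitioning(candidate_columns, toy_cost, 3)
print(levels, final_cost)
```

With this made-up workload the sketch picks `region` first, then `sale_date`, and rejects `channel` because its overhead outweighs the extra partition elimination. The candidate set is derived from the predicates that appear in the queries, which mirrors the predicate-driven framing of the abstract; a real MLPPI recommendation would also have to choose the partitioning expression for each level, which this sketch leaves out.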