Efficient query processing for multi-dimensionally clustered tables in DB2

Authors:
Bishwaranjan Bhattacharjee;Sriram Padmanabhan;Timothy Malkemus;Tony Lai;Leslie Cranston;Matthew Huras
Affiliations:
IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM T. J. Watson Research Center, Hawthorne, NY;IBM Toronto Laboratories, Markham, Ontario, Canada;IBM Toronto Laboratories, Markham, Ontario, Canada;IBM Toronto Laboratories, Markham, Ontario, Canada
Venue:
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Year:
2003

Citing 1

Multi-dimensional clustering: a new data layout scheme in DB2
Proceedings of the 2003 ACM SIGMOD international conference on Management of data

Cited 10

Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more

Star join revisited: Performance internals for cluster architectures
Data & Knowledge Engineering

Automated design of multidimensional clustering tables for relational databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30

Efficient bulk deletes for multi dimensional clustered tables in DB2
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

Increasing buffer-locality for multiple index based scans through intelligent placement and index scan speed control
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

SOR: a practical system for ontology storage, reasoning and search
VLDB '07 Proceedings of the 33rd international conference on Very large data bases

Effective and efficient semantic web data management over DB2
Proceedings of the 2008 ACM SIGMOD international conference on Management of data

Intelligent Data Granulation on Load: Improving Infobright's Knowledge Grid
FGIT '09 Proceedings of the 1st International Conference on Future Generation Information Technology

Lightweight integration of IR and DB for scalable hybrid search with integrated ranking support
Web Semantics: Science, Services and Agents on the World Wide Web

Efficient evaluation of partially-dimensional range queries using adaptive r*-tree
DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications

Abstract

We have introduced a Multi-Dimensional Clustering (MDC) physical layout scheme for relational tables in DB2 Version 8.0. MDC is based on the definition of one or more orthogonal clustering attributes (or expressions) of a table. The table is organized physically by grouping records with similar values for the dimension attributes into a cluster. Each clustering key is allocated one or more blocks of physical storage, with the aim of storing the records belonging to that cluster almost contiguously. Block-oriented indexes are created to access these blocks. In this paper, we describe novel query processing techniques that provide significant performance improvements for MDC tables. Current database systems employ a repertoire of access methods including table scans, index scans, index ANDing, and index ORing. We have extended these access methods to process block-based MDC tables efficiently. A concept at the core of processing MDC tables is the block-oriented access technique. In addition, since MDC tables can include regular record-oriented indexes, we employ novel techniques to combine block and record indexes. Block-oriented processing is extended to nested loop joins and star joins as well. We show results from experiments using a star-schema database to validate our claims of improved performance with minimal overhead.
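
The block-oriented ideas in the abstract can be illustrated with a small in-memory sketch. This is a toy model written for this summary, not DB2's implementation: the class `MDCTable`, the helper `and_block_with_rids`, the block size of 2, and the sample `region`/`year`/`customer` data are all hypothetical. It assumes each block holds records of exactly one cell (one combination of dimension values), keeps one block index per dimension, and shows block index ANDing, block index ORing, and one simple way a block id set might be combined with a record-oriented index.

```python
from collections import defaultdict

BLOCK_SIZE = 2  # records per block; deliberately tiny (hypothetical value)


class MDCTable:
    """Toy multi-dimensionally clustered table (illustrative only)."""

    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.blocks = []                      # block id -> list of records
        self.cell_blocks = defaultdict(list)  # cell key -> its block ids
        # One block index per dimension: value -> set of block ids.
        self.block_index = {d: defaultdict(set) for d in self.dimensions}

    def insert(self, record):
        """Store the record in a block of its cell; return (block id, slot)."""
        key = tuple(record[d] for d in self.dimensions)
        ids = self.cell_blocks[key]
        if not ids or len(self.blocks[ids[-1]]) >= BLOCK_SIZE:
            # The cell has no block with free space: allocate a new one
            # and register it in every dimension's block index.
            self.blocks.append([])
            ids.append(len(self.blocks) - 1)
            for dim, value in zip(self.dimensions, key):
                self.block_index[dim][value].add(ids[-1])
        bid = ids[-1]
        self.blocks[bid].append(record)
        return bid, len(self.blocks[bid]) - 1

    def blocks_for(self, dim, value):
        """Block index lookup: ids of all blocks holding dim == value."""
        return set(self.block_index[dim].get(value, set()))

    def scan_blocks(self, block_ids):
        """Block-oriented access: read whole blocks, in block id order."""
        for bid in sorted(block_ids):
            yield from self.blocks[bid]


def and_block_with_rids(block_ids, rids):
    """Keep only record ids (block id, slot) that fall in qualifying blocks."""
    return [rid for rid in rids if rid[0] in block_ids]


table = MDCTable(["region", "year"])
customer_index = defaultdict(list)  # record-oriented index on a non-dimension column

rows = [
    {"region": "east", "year": 2002, "customer": "a", "amount": 10},
    {"region": "east", "year": 2002, "customer": "b", "amount": 20},
    {"region": "east", "year": 2002, "customer": "a", "amount": 30},
    {"region": "east", "year": 2003, "customer": "a", "amount": 40},
    {"region": "west", "year": 2002, "customer": "b", "amount": 50},
]
for row in rows:
    customer_index[row["customer"]].append(table.insert(row))

# Block index ANDing: region = 'east' AND year = 2002.
and_ids = table.blocks_for("region", "east") & table.blocks_for("year", 2002)
and_amounts = [r["amount"] for r in table.scan_blocks(and_ids)]

# Block index ORing: region = 'west' OR year = 2003.
or_ids = table.blocks_for("region", "west") | table.blocks_for("year", 2003)
or_amounts = [r["amount"] for r in table.scan_blocks(or_ids)]

# Block index combined with a record index: the AND above plus customer = 'a'.
hybrid_rids = and_block_with_rids(and_ids, customer_index["a"])
hybrid_amounts = [table.blocks[bid][slot]["amount"] for bid, slot in hybrid_rids]
```

Because every record in a block shares the same dimension values in this model, the set operations work on block ids rather than record ids, and the qualifying blocks can be read in full without rechecking the dimension predicates. The intersection and union therefore touch far fewer index entries than their record-level equivalents would.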