Selection and pruning algorithms for bitmap index selection problem using data mining

Authors:
Ladjel Bellatreche;Rokia Missaoui;Hamid Necir;Habiba Drias
Affiliations:
Poitiers University, LISI/ENSMA France;University of Quebec in Outaouais, Canada;Institut National d'Informatique, Algerie;Institut National d'Informatique, Algerie
Venue:
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Year:
2007

Abstract

Indexing schemes are redundant structures offered by DBMSs to speed up complex queries. Two types of indices are available: mono-attribute indices (B-tree, bitmap, hash, etc.) and multi-attribute indices (join indices, bitmap join indices). In relational data warehouses, bitmap join indices (BJIs) are bitmap indices that optimize star join queries through bitwise operations. They can be used to avoid actual joins of tables, or to greatly reduce the volume of data that must be joined, by applying restrictions in advance. BJIs are defined using non-key dimension attributes and fact key attributes. Selecting these indices is difficult because a large number of candidate attributes (defined on dimension tables) could participate in building them. To reduce this complexity, we propose an approach that first prunes the search space of the problem using data mining techniques and then, on the reduced search space, uses a greedy algorithm to select BJIs that minimize the cost of executing a set of queries while satisfying a storage constraint. The main peculiarity of our pruning approach, compared with existing ones (which use only the frequency with which indexable attributes appear in queries as a pruning metric), is that it also uses other parameters, such as the size of the dimension tables, the length of each tuple, and the size of a disk page.
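
The prune-then-greedy pipeline described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' method: the `Candidate` class, the `prune`, `workload_cost` and `greedy_select` functions, the toy cost model (a query covered by a selected index costs a fixed fraction of its scan cost) and the sample numbers are all hypothetical. In particular, the pruning step here uses only query frequency, which is the baseline metric the abstract contrasts against, because the abstract does not give the formula that combines table size, tuple length and page size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate BJI on one dimension attribute (hypothetical model)."""
    attribute: str
    storage: int        # assumed index size, in disk pages
    queries: frozenset  # ids of the queries whose predicates use the attribute


def prune(candidates, min_support):
    """Frequency-only pruning (baseline, NOT the paper's metric):
    keep candidates referenced by at least `min_support` queries."""
    return [c for c in candidates if len(c.queries) >= min_support]


def workload_cost(selected, query_costs, indexed_fraction=0.1):
    """Toy cost model (assumption): a query covered by any selected index
    costs `indexed_fraction` of its full scan cost."""
    covered = set().union(*(c.queries for c in selected))
    return sum(
        cost * indexed_fraction if q in covered else cost
        for q, cost in query_costs.items()
    )


def greedy_select(candidates, query_costs, budget):
    """Repeatedly add the candidate that lowers the workload cost the most
    while the total index storage stays within `budget`."""
    selected, used = [], 0
    current = workload_cost(selected, query_costs)
    remaining = list(candidates)
    while remaining:
        best, best_cost = None, current
        for c in remaining:
            if used + c.storage > budget:
                continue  # would violate the storage constraint
            cost = workload_cost(selected + [c], query_costs)
            if cost < best_cost:
                best, best_cost = c, cost
        if best is None:
            break  # nothing fits or nothing improves the cost
        selected.append(best)
        remaining.remove(best)
        used += best.storage
        current = best_cost
    return selected, current


if __name__ == "__main__":
    # Hypothetical workload: full scan cost of each query, in page reads.
    query_costs = {"q1": 100, "q2": 80, "q3": 50}
    candidates = [
        Candidate("city", 40, frozenset({"q1", "q2"})),
        Candidate("month", 30, frozenset({"q2", "q3"})),
        Candidate("gender", 10, frozenset({"q3"})),
        Candidate("rare", 5, frozenset()),
    ]
    chosen, cost = greedy_select(prune(candidates, 1), query_costs, budget=50)
    print([c.attribute for c in chosen])  # ['city', 'gender']
```

With a budget of 50 pages, the sketch first picks `city` (largest cost reduction), skips `month` because it no longer fits, and then adds `gender`. A real implementation would replace `workload_cost` with the DBMS optimizer's estimates or an analytical cost model.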