An improved approach for automatic selection of multi-tables indexes in relational data warehouses using maximal frequent itemsets

Authors:
B. Ziani;Y. Ouinten
Affiliations:
Department of Mathematics and Computer Science, LIM, Laghouat, Algeria;Department of Mathematics and Computer Science, LIM, Laghouat, Algeria
Venue:
Intelligent Decision Technologies
Year:
2013

Abstract

The performance of a data warehouse depends crucially on its physical design, in which one of the most challenging tasks is selecting an appropriate set of indexes for a representative workload under a storage constraint. The problem becomes even more complex for multi-table indexes such as bitmap join indexes, since it involves searching a vast space of possible configurations. The attributes that queries reference, and how frequently they reference them, play an important role in determining the efficiency of the selected indexes. In this paper, we treat index selection as a typical frequent itemset mining problem. The indexes are built from combinations of attributes, viewed as items. The queries in the workload, viewed as transactions, are described by the attributes they involve. The foundation of our approach is the concept of maximal frequent itemsets. This data mining technique helps to discover strong correlations among attributes, such that the presence of some attributes in a query implies the presence of others. Moreover, by avoiding the generation of redundant indexes, the proposed approach leads to a solution that expresses the set of relevant indexes more succinctly and consequently reduces the storage space required. Unlike previous approaches, which focus on the single configuration with the minimum workload cost, we suggest considering a set of optimized solutions, and we propose a metric for measuring profit-effectiveness that helps to pick the most promising one. Through a set of experiments on the APB-1 benchmark, we show that our approach achieves better performance than similar methods, with significant savings in index storage.
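
The mapping described in the abstract (queries as transactions, attributes as items, maximal frequent itemsets as candidate indexes) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: the workload and attribute names are hypothetical, support is counted as a raw number of queries, and the search is a brute-force enumeration that is practical only for a handful of attributes. It is not the authors' implementation or a dedicated mining algorithm, and it leaves out the cost model, the storage constraint, and the profit-effectiveness metric.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Map each attribute set that occurs in at least `min_support`
    queries to its support count (brute-force, level by level)."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                frequent[itemset] = support
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no set of this size is frequent,
            # no larger set can be frequent either.
            break
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other for other in frequent)
    }


# Hypothetical workload: each query is reduced to the set of
# dimension attributes it references.
workload = [
    frozenset({"time.year", "product.category", "customer.region"}),
    frozenset({"time.year", "product.category"}),
    frozenset({"time.year", "product.category", "customer.region"}),
    frozenset({"customer.region", "channel.name"}),
    frozenset({"time.year", "customer.region"}),
]

# Each maximal frequent itemset is one candidate multi-attribute index.
candidates = maximal_frequent_itemsets(workload, min_support=2)
for attributes, support in candidates.items():
    print(sorted(attributes), support)
```

With a minimum support of 2, this toy workload yields a single candidate covering the three attributes `customer.region`, `product.category`, and `time.year`. Every smaller frequent combination is a subset of it and is therefore not reported separately, which is the redundancy-avoidance effect the abstract describes.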