High Performance Data Mining using Data Cubes on Parallel Computers

Authors:
Affiliations:
Venue:
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Year:
1998

Citing 9
Cited 2

Introduction to parallel computing: design and analysis of algorithms

Introduction to parallel computing: design and analysis of algorithms
Implementing data cubes efficiently

SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
From data mining to knowledge discovery: an overview

Advances in knowledge discovery and data mining
A Case Study of Software Process Improvement During Development

IEEE Transactions on Software Engineering
Data-Driven Discovery of Quantitative Rules in Relational Databases

IEEE Transactions on Knowledge and Data Engineering
Efficient Organization of Large Multidimensional Arrays

Proceedings of the Tenth International Conference on Data Engineering
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Discovery of Multiple-Level Association Rules from Large Databases

VLDB '95 Proceedings of the 21th International Conference on Very Large Data Bases
Parallel Data Cube Construction for High Performance On-Line Analytical Processing

HIPC '97 Proceedings of the Fourth International Conference on High-Performance Computing

Scatter and gather operations on an asynchronous communication model

SAC '00 Proceedings of the 2000 ACM symposium on Applied computing - Volume 2
Parallel Sparse Supports for Array Intrinsic Functions of Fortran 90

The Journal of Supercomputing

Quantified Score

Hi-index	0.00

Visualization

Abstract

On-Line Analytical Processing techniques are used for data analysis and decision support systems. The multidimensionality of the underlying data is well represented by multidimensional databases. For data mining in knowledge discovery, OLAP calculations can be effectively used. For these, high performance parallel systems are required to provide interactive analysis.Precomputed aggregate calculations in a Data Cube can provide efficient query processing for OLAP applications. In this article, we present parallel data cube construction on distributed-memory parallel computers from a relational database. Data Cube is used for data mining of associations using Attribute Focusing. Results are presented for these on the IBM-SP2, which show that our algorithms and techniques are scalable to a large number of processors, providing a high performance platform for such applications.