Parallel mining algorithms for generalized association rules with classification hierarchy

Authors:
Takahiko Shintani;Masaru Kitsuregawa
Affiliations:
Institute of Industrial Science, The University of Tokyo;Institute of Industrial Science, The University of Tokyo
Venue:
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Year:
1998

Citing 10
Cited 30

Efficient parallel data mining for association rules

CIKM '95 Proceedings of the fourth international conference on Information and knowledge management
Scalable parallel data mining for association rules

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Hash based parallel algorithms for mining association rules

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
A fast distributed algorithm for mining association rules

DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems
Efficient Mining of Association Rules in Distributed Databases

IEEE Transactions on Knowledge and Data Engineering
Parallel Mining of Association Rules

IEEE Transactions on Knowledge and Data Engineering
Mining Sequential Patterns: Generalizations and Performance Improvements

EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Fast Algorithms for Mining Association Rules in Large Databases

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Mining Generalized Association Rules

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
Mining Algorithms for Sequential Patterns in Parallel: Hash Based Approach

PAKDD '98 Proceedings of the Second Pacific-Asia Conference on Research and Development in Knowledge Discovery and Data Mining

Dynamic remote memory acquisition for parallel data mining on ATM-connected PC cluster

ICS '99 Proceedings of the 13th international conference on Supercomputing
Scalable algorithms for mining large databases

KDD '99 Tutorial notes of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
High performance data mining (tutorial PM-3)

Tutorial notes of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Web community mining and web log mining: commodity cluster based execution

ADC '02 Proceedings of the 13th Australasian database conference - Volume 5
Discovering calendar-based temporal association rules

Data & Knowledge Engineering - Special issue: Temporal representation and reasoning
Synthesizing High-Frequency Rules from Different Data Sources

IEEE Transactions on Knowledge and Data Engineering
Web Mining Is Parallel

HiPC '01 Proceedings of the 8th International Conference on High Performance Computing
Parallel Data Mining on Large Scale PC Cluster

WAIM '00 Proceedings of the First International Conference on Web-Age Information Management
Dynamic Load Balancing for Parallel Association Rule Mining on Heterogenous PC Cluster Systems

VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
SQL Based Association Rule Mining Using Commercial RDBMS (IBM DB2 UDB EEE)

DaWaK 2000 Proceedings of the Second International Conference on Data Warehousing and Knowledge Discovery
Parallel SQL Based Association Rule Mining on Large Scale PC Cluster: Performance Comparison with Directly Coded C Implementation

PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
Performance Analysis for Parallel Generalized Association Rule Mining on a Large Scale PC Cluster

Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Parallel Generalized Association Rule Mining on Large Scale PC Cluster

Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
Discovering associations in very large databases by approximating

Acta Cybernetica
Mining dynamic databases by weighting

Acta Cybernetica
Forecasting Association Rules Using Existing Data Sets

IEEE Transactions on Knowledge and Data Engineering
FP-tax: tree structure based generalized association rule mining

Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Efficient mining of both positive and negative association rules

ACM Transactions on Information Systems (TOIS)
A Super-Programming Approach for Mining Association Rules in Parallel on PC Clusters

IEEE Transactions on Parallel and Distributed Systems
An efficient strategy for mining exceptions in multi-databases

Information Sciences: an International Journal
Database classification for multi-database mining

Information Systems
Dynamic Association Rule Mining using Genetic Algorithms

Intelligent Data Analysis
Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules

MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Mining globally interesting patterns from multiple databases using kernel estimation

Expert Systems with Applications: An International Journal
Building user argumentative models

Applied Intelligence
A new dynamic load balancing technique for parallel modified PrefixSpan with distributed worker paradigm and its performance evaluation

ISHPC'05/ALPS'06 Proceedings of the 6th international symposium on high-performance computing and 1st international conference on Advanced low power systems
Association rule mining: models and algorithms

Association rule mining: models and algorithms
Preknowledge-based generalized association rules mining

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology
Generalized association rule mining using an efficient data structure

Expert Systems with Applications: An International Journal
A distributed recommender system architecture

International Journal of Web Engineering and Technology

Abstract

Association rule mining has recently attracted strong attention. A classification hierarchy over the data items is usually available, and users are interested in generalized association rules that span different levels of that hierarchy, since taking the hierarchy into account can yield more interesting rules.

In this paper, we propose new parallel algorithms for mining association rules with a classification hierarchy on a shared-nothing parallel machine, with the aim of improving mining performance. Our algorithms partition the candidate itemsets over the processors, which exploits the aggregate memory of the system effectively. If the candidate itemsets are partitioned without considering the classification hierarchy, both the items and all of their ancestor items have to be transmitted, which causes a prohibitively large amount of communication. Our method minimizes interprocessor communication by taking the hierarchy into account. Moreover, our algorithms make full use of the available memory by identifying the frequently occurring candidate itemsets and copying them to all the processors, so that these itemsets can be processed locally without any communication. This effectively reduces the load skew among the processors. We ran several experiments that vary the granularity at which itemsets are copied, from a whole tree down to a small group of frequent itemsets along the hierarchy. The coarser the grain, the easier the control, but the harder it is to achieve sufficient load balance. The finer the grain, the more complicated the control, but the better the load is balanced.

We implemented the proposed algorithms on an IBM SP-2. Performance evaluations show that our algorithms handle skew effectively and attain a sufficient speedup ratio.
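
The hierarchy-aware partitioning idea in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' algorithm: the toy hierarchy, the helper names (`extend`, `root_key`, `assign_owners`, `count_partitioned`) and the round-robin placement are all hypothetical. It assumes the hierarchy is a forest, assigns each candidate itemset to a processor using only the root ancestors of its items, "sends" each processor only the raw transaction items under the roots it needs, and lets the receiver rebuild the ancestors locally. The duplication of frequently occurring candidates described in the abstract is not modeled.

```python
from collections import Counter

# Hypothetical toy hierarchy: child -> parent (roots have no entry).
PARENT = {
    "jacket": "outerwear", "ski_pants": "outerwear",
    "outerwear": "clothes", "shirt": "clothes",
    "shoes": "footwear", "boots": "footwear",
}

def ancestors(item):
    """Return all proper ancestors of item, nearest first."""
    chain = []
    while item in PARENT:
        item = PARENT[item]
        chain.append(item)
    return chain

def root(item):
    """Return the top-level ancestor of item (the item itself if it is a root)."""
    chain = ancestors(item)
    return chain[-1] if chain else item

def extend(items):
    """Add every ancestor of every item, as generalized rule mining requires."""
    extended = set(items)
    for item in items:
        extended.update(ancestors(item))
    return extended

def root_key(candidate):
    """Describe a candidate itemset only by the roots of its items."""
    return tuple(sorted(root(i) for i in candidate))

def assign_owners(candidates, n_procs):
    """Assumed placement rule: round-robin over distinct root keys."""
    keys = sorted({root_key(c) for c in candidates})
    return {c: keys.index(root_key(c)) % n_procs for c in candidates}

def count_serial(transactions, candidates):
    """Reference support counts on a single processor."""
    counts = Counter()
    for t in transactions:
        items = extend(t)
        for c in candidates:
            if set(c) <= items:
                counts[c] += 1
    return counts

def count_partitioned(transactions, candidates, n_procs):
    """Simulate partitioned counting; also return the number of items 'sent'."""
    owners = assign_owners(candidates, n_procs)
    counts, sent = Counter(), 0
    for p in range(n_procs):
        local = [c for c in candidates if owners[c] == p]
        needed = {root(i) for c in local for i in c}
        for t in transactions:
            # Send raw items only; ancestors are derived on the receiving side.
            received = {i for i in t if root(i) in needed}
            sent += len(received)
            items = extend(received)
            for c in local:
                if set(c) <= items:
                    counts[c] += 1
    return counts, sent

# Hypothetical sample data for illustration.
TRANSACTIONS = [{"jacket", "boots"}, {"shirt", "shoes"},
                {"ski_pants", "boots"}, {"jacket"}]
CANDIDATES = [("clothes", "footwear"), ("boots", "outerwear"),
              ("footwear", "shirt"), ("jacket", "ski_pants")]

if __name__ == "__main__":
    counts, sent = count_partitioned(TRANSACTIONS, CANDIDATES, n_procs=2)
    print(dict(counts), "items sent:", sent)
```

Because every candidate whose items fall under the same roots lands on the same processor, a transaction item never has to travel together with its ancestors. In this sketch the partitioned counts match the serial ones, while only the lowest-level items are transmitted.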