Parallel Sequence Mining on Shared-Memory Machines

Authors: Mohammed Javeed Zaki
Venue: Revised Papers from Large-Scale Parallel Data Mining, Workshop on Large-Scale Parallel KDD Systems, SIGKDD
Year: 1999



Abstract

We present pSPADE, a parallel algorithm for fast discovery of frequent sequences in large databases. pSPADE decomposes the original search space into smaller suffix-based classes. Each class can be solved in main memory using efficient search techniques and simple join operations. Further, each class can be solved independently on each processor, requiring no synchronization. However, dynamic inter-class and intra-class load balancing must be exploited to ensure that each processor gets an equal amount of work. Experiments on a 12-processor SGI Origin 2000 shared-memory system show good speedup and excellent scaleup results.
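
The decomposition and scheduling idea in the abstract can be illustrated with a minimal sketch. This is not the pSPADE implementation: the names `suffix_classes`, `process_class`, and `mine_in_parallel` are hypothetical, sequences are assumed to be plain tuples of items, and the per-class "mining" step is a placeholder that only counts members. Only inter-class balancing is shown, using a shared task queue from which idle workers pull the next pending class. Python threads are used for brevity and would not give true CPU parallelism for compute-bound work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def suffix_classes(sequences, suffix_len=1):
    """Group sequences (tuples of items) by their last `suffix_len` items."""
    classes = defaultdict(list)
    for seq in sequences:
        classes[seq[-suffix_len:]].append(seq)
    return dict(classes)


def process_class(task):
    """Hypothetical stand-in for mining one class in main memory.

    A real miner would enumerate frequent sequences inside the class;
    here we only return the class size so the sketch stays runnable.
    """
    suffix, members = task
    return suffix, len(members)


def mine_in_parallel(sequences, workers=4):
    """Process each suffix class independently, with no shared state."""
    classes = suffix_classes(sequences)
    # Assumption: class size approximates its cost. Submitting the largest
    # classes first lets idle workers pick up smaller ones as they finish,
    # which is a simple form of dynamic inter-class load balancing.
    tasks = sorted(classes.items(), key=lambda kv: len(kv[1]), reverse=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(process_class, tasks))


if __name__ == "__main__":
    data = [("A", "B"), ("C", "B"), ("A", "C"), ("B",)]
    print(mine_in_parallel(data, workers=2))
```

Because each class is handled by a single call with no shared mutable state, no synchronization between workers is needed beyond the task queue itself, which mirrors the independence property the abstract describes.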