Distributed subgroup mining

Authors:
Michael Wurst;Martin Scholz
Affiliations:
Artificial Intelligence Group, University of Dortmund, Germany;Artificial Intelligence Group, University of Dortmund, Germany
Venue:
PKDD'06 Proceedings of the 10th European conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006

Citing 16
Cited 2

Explora: a multipattern and multistrategy discovery assistant
Advances in knowledge discovery and data mining

Communication-efficient distributed mining of association rules
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data

A fast distributed algorithm for mining association rules
DIS '96 Proceedings of the fourth international conference on Parallel and distributed information systems

Boosting Algorithms for Parallel and Distributed Learning
Distributed and Parallel Databases - Special issue: Parallel and distributed data mining

Parallel and Distributed Association Mining: A Survey
IEEE Concurrency

Parallel Mining of Association Rules
IEEE Transactions on Knowledge and Data Engineering

An Algorithm for Multi-relational Discovery of Subgroups
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery

Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases

Rule Evaluation Measures: A Unifying View
ILP '99 Proceedings of the 9th International Workshop on Inductive Logic Programming

Finding the most interesting patterns in a database quickly by using sequential sampling
The Journal of Machine Learning Research

Decision Support Through Subgroup Discovery: Three Case Studies and the Lessons Learned
Machine Learning

ROC 'n' Rule Learning—Towards a Better Understanding of Covering Algorithms
Machine Learning

Sampling-based sequential subgroup mining
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining

On the Tractability of Rule Discovery from Distributed Data
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining

Exploiting background knowledge for knowledge-intensive subgroup discovery
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Parallel and distributed methods for incremental frequent itemset mining
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Secure top-k subgroup discovery
PSDML'10 Proceedings of the international ECML/PKDD conference on Privacy and security issues in data mining and machine learning

Secure Distributed Subgroup Discovery in Horizontally Partitioned Data
Transactions on Data Privacy

Abstract

Subgroup discovery is a popular form of supervised rule learning, applicable to descriptive and predictive tasks. In this work we study two natural extensions of classical subgroup discovery to distributed settings. In the first variant the goal is to efficiently identify global subgroups, i.e. the rules an analysis would yield after collecting all the data at a single central database. In contrast, the second considered variant takes the locality of data explicitly into account. The aim is to find patterns that point out major differences between individual databases with respect to a specific property of interest (target attribute). We point out substantial differences between these novel learning problems and other kinds of distributed data mining tasks. These differences motivate new search and communication strategies, aiming at a minimization of computation time and communication costs. We present and empirically evaluate new algorithms for both considered variants.
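
The first variant asks for exactly the subgroups a centralized analysis would find. The sketch below illustrates why that is possible in principle without moving the raw data: for a fixed candidate rule, a count-based quality score can be rebuilt from a few integers per site. This is a minimal illustration under stated assumptions, not the algorithm from the paper. Weighted relative accuracy (WRAcc) is assumed as the quality measure, the data are assumed to be horizontally partitioned with a Boolean target, and the names `Counts`, `local_counts`, `merge` and `wracc`, as well as the toy data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Counts one site reports for one candidate rule (hypothetical message format)."""
    n: int            # examples stored at the site
    pos: int          # examples whose target value is positive
    covered: int      # examples matched by the rule body
    covered_pos: int  # matched examples whose target value is positive


def local_counts(rows, rule, target):
    """Compute the counts for `rule` on one site's rows (each row is a dict)."""
    covered = [r for r in rows if all(r[a] == v for a, v in rule.items())]
    return Counts(
        n=len(rows),
        pos=sum(1 for r in rows if r[target]),
        covered=len(covered),
        covered_pos=sum(1 for r in covered if r[target]),
    )


def merge(counts):
    """Add up per-site counts; only these four integers per site are exchanged."""
    counts = list(counts)
    return Counts(
        n=sum(c.n for c in counts),
        pos=sum(c.pos for c in counts),
        covered=sum(c.covered for c in counts),
        covered_pos=sum(c.covered_pos for c in counts),
    )


def wracc(c):
    """Weighted relative accuracy: coverage * (precision - base rate)."""
    if c.n == 0 or c.covered == 0:
        return 0.0
    return (c.covered / c.n) * (c.covered_pos / c.covered - c.pos / c.n)


# Toy data, invented for illustration only.
site_a = [
    {"smoker": True, "sick": True},
    {"smoker": True, "sick": True},
    {"smoker": False, "sick": False},
    {"smoker": False, "sick": False},
]
site_b = [
    {"smoker": True, "sick": False},
    {"smoker": False, "sick": True},
    {"smoker": False, "sick": False},
    {"smoker": True, "sick": True},
]
rule, target = {"smoker": True}, "sick"

merged = merge([local_counts(site_a, rule, target),
                local_counts(site_b, rule, target)])
central = local_counts(site_a + site_b, rule, target)

print(merged == central)  # True: merged counts match the centralized counts
print(wracc(merged))      # 0.125
```

On this toy data, site A alone scores 0.25 for the same rule while the global score is 0.125. That gap hints at the kind of local-versus-global difference the second variant is concerned with, although the measures and search strategies the paper actually uses are not given in this record.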