Distributed data mining: why do more than aggregating models

Authors:
Mohamed Aoun-Allah;Guy Mineau
Affiliations:
Computer Science and Software Engineering Department, Laval University, Quebec City, Canada;Computer Science and Software Engineering Department, Laval University, Quebec City, Canada
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007

Abstract

In this paper we deal with the problem of mining large distributed databases. We show that aggregating models, i.e., sets of disjoint classification rules each built over a subdatabase, is enough to obtain an aggregated model that is both predictive and descriptive, that has excellent prediction capability, and that is conceptually much simpler than comparable techniques. These results are made possible by lifting the disjoint cover constraint on the aggregated model and by using a confidence coefficient associated with each rule in a weighted majority vote.
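
The weighted majority vote mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Rule` class, the `weighted_vote` function, the attribute names, and the confidence values are all hypothetical, and how the confidence coefficient is actually computed is not stated in the abstract. The sketch only assumes that rule sets from several sites are pooled without requiring them to stay disjoint, and that every rule covering an example votes for its class with a weight equal to its confidence.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class Rule:
    """A classification rule with an attached confidence (hypothetical structure)."""
    condition: Callable[[Dict[str, float]], bool]  # True when the rule covers the example
    label: Hashable                                # class predicted by the rule
    confidence: float                              # assumed weight of the rule's vote


def weighted_vote(rules: List[Rule], example: Dict[str, float],
                  default: Optional[Hashable] = None) -> Optional[Hashable]:
    """Return the class with the largest summed confidence among covering rules."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for rule in rules:
        if rule.condition(example):
            scores[rule.label] += rule.confidence
    if not scores:
        return default  # no rule covers the example
    return max(scores, key=scores.get)


# Two illustrative rule sets, each disjoint within its own site (made-up values).
site_a = [
    Rule(lambda x: x["age"] < 30, "low", 0.9),
    Rule(lambda x: x["age"] >= 30, "high", 0.6),
]
site_b = [
    Rule(lambda x: x["income"] > 50, "high", 0.7),
    Rule(lambda x: x["income"] <= 50, "low", 0.8),
]

# The aggregated model is the plain union: rules from different sites may overlap.
aggregated = site_a + site_b

prediction_1 = weighted_vote(aggregated, {"age": 25, "income": 80})
prediction_2 = weighted_vote(aggregated, {"age": 40, "income": 80})
```

In this toy setting, an example covered by conflicting rules from different sites is resolved by the summed confidences rather than by forcing the pooled rules back into a disjoint cover.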