Distributed data mining: why do more than aggregating models

  • Authors:
  • Mohamed Aoun-Allah;Guy Mineau

  • Affiliations:
  • Computer Science and Software Engineering Department, Laval University, Quebec City, Canada (both authors)

  • Venue:
  • IJCAI'07: Proceedings of the 20th International Joint Conference on Artificial Intelligence
  • Year:
  • 2007

Abstract

In this paper, we address the problem of mining large distributed databases. We show that aggregating models, i.e., sets of disjoint classification rules, each built over a subdatabase, is sufficient to obtain an aggregated model that is both predictive and descriptive, that offers excellent prediction capability, and that is conceptually much simpler than comparable techniques. These results are made possible by lifting the disjoint cover constraint on the aggregated model and by associating a confidence coefficient with each rule, which is then used in a weighted majority vote.
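
The aggregation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the confidence coefficient is computed, so the `confidence` field, the `Rule` class, the `aggregate` and `predict` functions, and the toy rules below are all hypothetical. The sketch only shows the general shape of the technique: rule sets from several sub-databases are pooled without enforcing a disjoint cover, and every rule that covers an example votes for its class with a weight equal to its confidence.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class Rule:
    """One classification rule (hypothetical structure, for illustration)."""
    condition: Callable[[Dict[str, float]], bool]  # True when the rule covers the example
    label: Hashable                                # class predicted by the rule
    confidence: float                              # assumed weight, e.g. in [0, 1]


def aggregate(models: List[List[Rule]]) -> List[Rule]:
    """Pool the rules of all site models; overlapping rules are kept as-is."""
    return [rule for model in models for rule in model]


def predict(rules: List[Rule], example: Dict[str, float],
            default: Optional[Hashable] = None) -> Optional[Hashable]:
    """Weighted majority vote over every rule that covers the example."""
    votes: Dict[Hashable, float] = defaultdict(float)
    for rule in rules:
        if rule.condition(example):
            votes[rule.label] += rule.confidence
    if not votes:
        return default  # no rule covers the example
    return max(votes, key=votes.get)


# Toy models, one per sub-database (made-up attributes and weights).
site_a = [
    Rule(lambda x: x["age"] < 30, "yes", 0.9),
    Rule(lambda x: x["age"] >= 30, "no", 0.7),
]
site_b = [
    Rule(lambda x: x["income"] > 50, "no", 0.6),
    Rule(lambda x: x["income"] <= 50, "yes", 0.8),
]

aggregated = aggregate([site_a, site_b])
young_high_income = predict(aggregated, {"age": 25, "income": 60})
older_high_income = predict(aggregated, {"age": 40, "income": 60})
```

In this toy setting, the two site models disagree on the first example, and the rule with the larger confidence decides the outcome. Each site's rule set is disjoint on its own, but the pooled set is not, which is the situation the weighted vote is meant to resolve.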