Distributed Data Mining Methodology with Classification Model Example

  • Authors:
  • Marcin Gorawski;Ewa Płuciennik-Psota

  • Affiliations:
  • Institute of Computer Science, Silesian University of Technology, Gliwice, Poland 44-100;Institute of Computer Science, Silesian University of Technology, Gliwice, Poland 44-100

  • Venue:
  • ICCCI '09 Proceedings of the 1st International Conference on Computational Collective Intelligence. Semantic Web, Social Networks and Multiagent Systems
  • Year:
  • 2009

Abstract

Distributed computing and data mining are both essential to many commercial and scientific organizations. Data mining is the process of building analytical models of data, and it is costly in both time and hardware resources. Distribution is often inherent in an organization's structure. The authors propose a methodology for distributed data mining that combines local analytical models (built in parallel on the nodes of a distributed computer system) into a global model, without the need to construct a distributed version of the data mining algorithm. Several combining strategies are proposed, together with a method for verifying them. The proposed solutions were tested on data sets from the UCI Machine Learning Repository.
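
The general idea of building local models in parallel and merging them into a global one can be illustrated with a minimal sketch. The abstract does not say which classifiers or combining strategies the authors use, so everything here is an assumption made for illustration: the local model is a nearest-centroid classifier, the combining strategy is simple majority voting, threads stand in for the nodes of a distributed system, and all names (`train_local_model`, `build_global_model`, `TOY_PARTITIONS`) and the toy data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def train_local_model(rows):
    """Train a nearest-centroid classifier on one node's data partition.

    rows is a list of (features, label) pairs. The returned model maps
    each label to the per-feature mean of its training points.
    """
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*points)]
        for label, points in by_label.items()
    }


def predict_local(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


def build_global_model(partitions):
    """Train one local model per partition, then combine them by voting.

    Assumption: majority voting is used here only as a stand-in for a
    combining strategy; it is not taken from the paper.
    """
    # Threads stand in for the nodes of a distributed system.
    with ThreadPoolExecutor() as pool:
        local_models = list(pool.map(train_local_model, partitions))

    def global_predict(features):
        votes = Counter(predict_local(m, features) for m in local_models)
        return votes.most_common(1)[0][0]

    return global_predict


# Hypothetical toy data: three nodes, each holding its own partition.
TOY_PARTITIONS = [
    [((1.0, 1.0), "a"), ((9.0, 9.0), "b")],
    [((2.0, 1.0), "a"), ((8.0, 9.0), "b")],
    [((1.0, 2.0), "a"), ((9.0, 8.0), "b")],
]

global_model = build_global_model(TOY_PARTITIONS)
```

Calling `global_model((1.5, 1.5))` asks every local model for a label and returns the most common answer. The learning algorithm itself is never modified to run in a distributed way; only its outputs are combined, which is the property the abstract emphasizes.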