Distributed Data Mining Methodology with Classification Model Example

Authors:
Marcin Gorawski;Ewa Płuciennik-Psota
Affiliations:
Institute of Computer Science, Silesian University of Technology, Gliwice, Poland 44-100;Institute of Computer Science, Silesian University of Technology, Gliwice, Poland 44-100
Venue:
ICCCI '09 Proceedings of the 1st International Conference on Computational Collective Intelligence. Semantic Web, Social Networks and Multiagent Systems
Year:
2009

Abstract

Distributed computing and data mining are both essential for many commercial and scientific organizations. Data mining, the process of building analytical models of data, is demanding in both time and hardware resources, and distribution is often already part of an organization's structure. The authors propose a methodology for distributed data mining that combines local analytical models (built in parallel on the nodes of a distributed computer system) into a global one, without the need to construct a distributed version of the data mining algorithm. Several combining strategies are proposed, along with a method for verifying them. The proposed solutions were tested on data sets from the UCI Machine Learning Repository.
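
The general idea of combining local models into a global one can be illustrated with a minimal sketch. The abstract does not describe the authors' combining strategies, so everything here is an assumption made for illustration: the local learner is a simple nearest-centroid classifier, the combining strategy is plain majority voting, threads stand in for the nodes of a distributed system, and the names `train_local_model`, `predict_local`, `build_global_model`, and `predict_global` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_local_model(rows):
    """Build a local model on one node: the mean feature vector per class.

    `rows` is a list of (features, label) pairs held by that node.
    """
    sums, counts = {}, Counter()
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_local(model, features):
    """Classify with one local model: pick the class with the nearest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def build_global_model(partitions):
    """Train one local model per partition in parallel (threads stand in for nodes).

    The "global model" in this sketch is simply the list of local models.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(train_local_model, partitions))


def predict_global(models, features):
    """Combine local predictions by majority vote (an assumed strategy)."""
    votes = Counter(predict_local(model, features) for model in models)
    return votes.most_common(1)[0][0]


# Toy data: three nodes, each holding its own slice of a two-class data set.
node1 = [((0.0, 0.0), "a"), ((1.0, 1.0), "a"), ((10.0, 10.0), "b")]
node2 = [((0.5, 0.0), "a"), ((9.0, 9.5), "b"), ((11.0, 10.0), "b")]
node3 = [((1.0, 0.0), "a"), ((10.0, 9.0), "b")]

models = build_global_model([node1, node2, node3])
print(predict_global(models, (0.2, 0.3)))
print(predict_global(models, (9.5, 9.8)))
```

Note that the learning algorithm itself is unchanged: each node runs the ordinary, non-distributed learner on its own data, and only the combination step knows about the distribution. Using an odd number of nodes avoids tied votes in this two-class toy setting.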