An A-Team approach to learning classifiers from distributed data sources

Authors:
Ireneusz Czarnowski; Piotr Jedrzejowicz; Izabela Wierzbowska
Affiliations:
Department of Information Systems, Gdynia Maritime University, Morska 83, 81-225 Gdynia, Poland; Department of Information Systems, Gdynia Maritime University, Morska 83, 81-225 Gdynia, Poland; Department of Information Systems, Gdynia Maritime University, Morska 83, 81-225 Gdynia, Poland
Venue:
International Journal of Intelligent Information and Database Systems
Year:
2010

Abstract

Distributed data mining, the task of analysing data held in different sources, is an important research area. Solving such tasks requires approaches and tools different from those designed for analysing data located in a single database. This paper presents an approach to learning classifiers from distributed data that is based on data reduction (prototype selection) at the local level. In this case, the aim of data reduction is to obtain a compact representation of the distributed data repositories that retains non-redundant information in the form of so-called prototypes. The approach has been implemented using the JABAT environment, which, in turn, is an implementation of the A-Team concept. The paper includes a general overview of JABAT, the problem formulation, and a description of the proposed solution, in which the global classifier is induced from prototypes selected from the distributed datasets during data reduction at the local level. Finally, results of a computational experiment validating the approach are shown. They indicate that the proposed classifier can produce very good classification results.
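The two-level scheme described in the abstract (reduce data to prototypes at each site, then induce one global classifier from the pooled prototypes) can be illustrated with a minimal Python sketch. It is not the authors' JABAT/A-Team implementation: the per-class centroid reduction and the nearest-prototype rule are simple stand-ins chosen for brevity, and the function names, site names, and toy data are all hypothetical.

```python
from collections import defaultdict
from math import dist


def select_prototypes(local_data):
    """Reduce one site's data to a single centroid per class.

    Assumption: centroids stand in for the paper's prototype selection,
    which this sketch does not reproduce.
    """
    by_class = defaultdict(list)
    for features, label in local_data:
        by_class[label].append(features)
    prototypes = []
    for label, rows in by_class.items():
        # Average each feature column over the instances of this class.
        centroid = tuple(sum(col) / len(rows) for col in zip(*rows))
        prototypes.append((centroid, label))
    return prototypes


def build_global_classifier(sites):
    """Pool the prototypes of all sites and return them with a classifier.

    Only prototypes leave a site; the raw local data stays where it is.
    """
    pooled = [p for site in sites for p in select_prototypes(site)]

    def classify(features):
        # Assumption: a 1-nearest-prototype rule as the global classifier.
        return min(pooled, key=lambda p: dist(p[0], features))[1]

    return pooled, classify


# Two hypothetical sites holding toy two-dimensional data.
site_a = [((0.0, 0.0), "neg"), ((0.2, 0.1), "neg"), ((1.0, 1.0), "pos")]
site_b = [((0.9, 1.1), "pos"), ((1.1, 0.9), "pos"), ((0.1, 0.3), "neg")]

prototypes, classify = build_global_classifier([site_a, site_b])
```

In this toy setting the six local instances are reduced to four prototypes, and `classify` assigns a new instance the label of its closest pooled prototype. Any other reduction method or learner could be substituted at either level without changing the overall structure.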