Distributed data mining is an important research area whose task is to analyse data from different sources. Solving such tasks requires special approaches and tools, different from those dedicated to analysing data located in a single database. This paper presents an approach to learning classifiers from distributed data that is based on data reduction (prototype selection) at the local level. In this case, the aim of data reduction is to obtain a compact representation of the distributed data repositories that includes non-redundant information in the form of so-called prototypes. The approach has been implemented using the JABAT environment, which is in turn an implementation of the A-Team concept. The paper includes a general overview of JABAT, the problem formulation, and a description of the proposed solution, in which the global classifier is induced from prototypes selected from the distributed datasets during data reduction at the local level. Finally, results of a computational experiment validating the approach are shown. They indicate that the proposed classifier can produce very good classification results.
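The two-level idea described in the abstract (reduce each local dataset to prototypes, then induce one global classifier from the pooled prototypes) can be illustrated with a minimal sketch. This is not the paper's method: the agent-based selection performed in JABAT is replaced here by a deliberately simple stand-in (one class centroid per class per site), and the global classifier is assumed to be a 1-nearest-neighbour rule. The function names `select_prototypes` and `build_global_classifier` and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def select_prototypes(samples):
    """Local level: reduce one site's data to one prototype per class.

    Assumption: the class centroid stands in for the paper's
    agent-based prototype selection, which is not reproduced here.
    """
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append(features)
    prototypes = []
    for label, rows in by_class.items():
        # Column-wise mean of all feature vectors sharing this label.
        centroid = tuple(sum(col) / len(rows) for col in zip(*rows))
        prototypes.append((centroid, label))
    return prototypes


def build_global_classifier(local_datasets):
    """Global level: pool prototypes from all sites, classify by 1-NN."""
    pooled = [p for site in local_datasets for p in select_prototypes(site)]

    def predict(features):
        # Return the label of the nearest pooled prototype.
        return min(pooled, key=lambda p: dist(p[0], features))[1]

    return predict, pooled


# Hypothetical toy data held at two separate sites.
site_a = [((0.0, 0.1), "neg"), ((0.2, 0.0), "neg"), ((1.0, 1.1), "pos")]
site_b = [((0.1, 0.3), "neg"), ((0.9, 1.0), "pos"), ((1.2, 0.8), "pos")]

predict, pooled = build_global_classifier([site_a, site_b])
print(len(pooled), predict((0.05, 0.05)), predict((1.0, 1.0)))
```

Only the prototypes leave each site, so the global learner sees four points instead of six raw instances; with realistic data volumes the reduction would be far larger, which is the motivation the abstract gives for reducing data locally.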