Distributed learning with data reduction

  • Authors:
  • Ireneusz Czarnowski

  • Affiliations:
  • Department of Information Systems, Gdynia Maritime University, Gdynia, Poland

  • Venue:
  • Transactions on computational collective intelligence IV
  • Year:
  • 2011

Abstract

The work deals with distributed machine learning. Distributed learning from data is considered an important challenge faced by researchers and practitioners in the domain of distributed data mining and distributed knowledge discovery from databases. Currently, learning from data is recognized as one of the most widely investigated paradigms of machine learning. At the same time, it is perceived as a difficult and demanding computational problem. Learning from distributed data is even more complex and still, to a large extent, an open problem. One approach suitable for learning from geographically distributed data is to select relevant local patterns, also called prototypes, from the local databases. Such prototypes are selected using specialized data reduction methods. The dissertation contains an overview of the problem of learning classifiers from data, followed by a discussion of distributed learning, including the problem formulation and a state-of-the-art review. Next, data reduction approaches, techniques and algorithms are discussed. The central part of the dissertation proposes an agent-based distributed learning framework. The idea is to carry out data reduction in parallel at separate locations, employing specialized software agents. The process ends when the locally selected prototypes are moved to a central site and merged into the global knowledge model. The following part of the work contains the results of an extensive computational experiment aimed at validating the proposed approach. Finally, conclusions and suggestions for further research are formulated.
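The overall flow described in the abstract (reduce data locally and in parallel, move the prototypes to a central site, merge them into a global model) can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the abstract does not specify the agents' reduction method, so class centroids are used here as a simple stand-in, a thread pool stands in for the software agents, and a nearest-prototype rule stands in for the global knowledge model. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from math import dist


def select_prototypes(local_data):
    """Reduce one site's data to one prototype per class.

    Assumption: the class centroid is used as the prototype. This is a
    stand-in for whatever data reduction method a real agent would apply.
    """
    by_class = defaultdict(list)
    for features, label in local_data:
        by_class[label].append(features)
    prototypes = []
    for label, rows in by_class.items():
        # Column-wise mean of all feature vectors that share this label.
        centroid = tuple(sum(col) / len(rows) for col in zip(*rows))
        prototypes.append((centroid, label))
    return prototypes


def learn_global_model(sites):
    """Run the reduction for every site in parallel, then merge the results."""
    with ThreadPoolExecutor() as pool:
        local_results = list(pool.map(select_prototypes, sites))
    # "Moving prototypes to the central site" is modelled as concatenation.
    return [proto for site in local_results for proto in site]


def classify(model, x):
    """Label x with the class of its nearest prototype (Euclidean distance)."""
    return min(model, key=lambda proto: dist(proto[0], x))[1]


# Two hypothetical local databases holding (features, label) pairs.
site_a = [((0.0, 0.1), "neg"), ((0.2, 0.0), "neg"), ((1.0, 1.1), "pos")]
site_b = [((0.9, 1.0), "pos"), ((1.1, 0.9), "pos"), ((0.1, 0.2), "neg")]

model = learn_global_model([site_a, site_b])
print(len(model), classify(model, (0.05, 0.05)), classify(model, (1.0, 1.0)))
```

In this sketch only the prototypes leave each site, so the data transferred to the central site grows with the number of classes per site rather than with the size of the local databases.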