Parallel Preprocessing for Classification Problems in Knowledge Discovery Systems

  • Authors:
  • N. R. Akchurina; V. N. Vagin

  • Affiliations:
  • International Graduate School of Dynamic Intelligent Systems, Paderborn, Germany; Moscow Power Engineering Institute, Krasnokazarmennaya 14, 111250, Moscow, Russia

  • Venue:
  • Proceedings of the 2006 conference on Knowledge-Based Software Engineering: Proceedings of the Seventh Joint Conference on Knowledge-Based Software Engineering
  • Year:
  • 2006

Abstract

A new algorithm is proposed for a topical problem in machine learning: the joint preprocessing of qualitative and quantitative attributes with missing values. A parallel version of the algorithm, developed by the authors, is also presented. In thorough tests on 55 databases from the UC Irvine Repository, which were compiled from real databases in various fields specifically for testing and comparing generalization algorithms, the proposed algorithms increased classification accuracy (the main criterion of the learning process) in almost all cases for several well-known classification algorithms: ID3, C4.5, Naïve Bayes, table majority, and an instance-based algorithm. When resources are available, the parallel version of the algorithm speeds up preprocessing efficiently.