A new algorithm is proposed for a pressing problem in machine learning: the joint preprocessing of qualitative and quantitative attributes with missing values. A parallel version of the algorithm, developed by the authors, is also presented. In thorough tests on 55 databases from the UC Irvine Repository, which were derived from real databases in various fields for testing and comparing generalization algorithms, the proposed algorithms increased the classification accuracy (the main criterion of the learning process) of the well-known classification algorithms ID3, C4.5, Naïve Bayes, table majority, and an instance-based algorithm in almost all cases. When sufficient computing resources are available, the parallel version of the algorithm speeds up preprocessing efficiently.