A GA-Based wrapper feature selection for animal breeding data mining

  • Authors:
  • Olgierd Unold; Maciej Dobrowolski; Henryk Maciejewski; Pawel Skrobanek; Ewa Walkowicz

  • Affiliations:
  • Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wroclaw, Poland; Department of Horse Breeding and Riding, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland; Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wroclaw, Poland; Institute of Computer Engineering, Control and Robotics, Wroclaw University of Technology, Wroclaw, Poland; Department of Horse Breeding and Riding, Wroclaw University of Environmental and Life Sciences, Wroclaw, Poland

  • Venue:
  • HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part II
  • Year:
  • 2012

Abstract

Feature selection methods are used to tackle the curse of dimensionality in data to be mined. This also applies to animal breeding, where datasets record a remarkably large number of animal features. In this paper, we conduct a comprehensive study of 12 classification methods and 12 GA-based wrapper feature selection methods for the classification of Silesian horse data. To assess the performance of the wrappers and the classification methods on this dataset, we used two metrics: a rank metric, the area under the ROC curve (AUC), and a probability metric, the root mean square error (RMSE). All of the classifiers and wrappers were taken from the Weka machine learning software. We find that most of the GA-based wrappers achieved results no worse than those obtained on the full high-dimensional dataset. The statistical results make three classifiers, a decision tree (ADT), logistic regression (Log), and a bagging method (Bag), competitive methods to be considered in animal breeding data mining.
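
To make the idea of a GA-based wrapper concrete, the following is a minimal sketch in plain Python and not the Weka setup used in the paper. It rests on several assumptions: the data are synthetic, the classifier is a toy nearest-centroid scorer standing in for the 12 Weka classifiers, fitness is the AUC on a single held-out split, and the GA parameters (population size, mutation rate, number of generations) are arbitrary illustrative values. All function names are hypothetical.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scores(train_X, train_y, test_X, mask):
    """Toy stand-in classifier: higher score = closer to the positive centroid."""
    cols = [j for j, keep in enumerate(mask) if keep]
    def centroid(label):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        return [sum(r[j] for r in rows) / len(rows) for j in cols]
    c1, c0 = centroid(1), centroid(0)
    def dist(x, c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(cols, c))
    return [dist(x, c0) - dist(x, c1) for x in test_X]

def fitness(mask, train, test):
    """Wrapper fitness: AUC of the classifier trained on the selected features."""
    if not any(mask):
        return 0.0  # an empty feature subset cannot classify anything
    (Xtr, ytr), (Xte, yte) = train, test
    return auc(centroid_scores(Xtr, ytr, Xte, mask), yte)

def ga_select(train, test, n_features, pop_size=20, generations=15,
              p_mut=0.05, seed=0):
    rng = random.Random(seed)
    # Random bit masks, plus the full feature set so that the result is
    # never worse than using all features on this split (an assumption
    # of this sketch, not a detail taken from the paper).
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size - 1)]
    pop.append([True] * n_features)
    best = max(pop, key=lambda m: fitness(m, train, test))
    for _ in range(generations):
        def pick():  # binary tournament selection
            a, b = rng.sample(pop, 2)
            return a if fitness(a, train, test) >= fitness(b, train, test) else b
        children = [best[:]]  # elitism: carry the best mask forward
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=lambda m: fitness(m, train, test))
    return best, fitness(best, train, test)

def make_data(n, n_features, rng):
    """Synthetic data: only features 0 and 1 carry class information."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label
        row[1] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
train = make_data(60, 8, rng)
test = make_data(60, 8, rng)
full = fitness([True] * 8, train, test)      # baseline: all features
mask, score = ga_select(train, test, n_features=8)
print("AUC with all features:", round(full, 3))
print("AUC with GA-selected subset:", round(score, 3), "mask:", mask)
```

Because the full feature set is seeded into the initial population and the best mask is always carried forward, the selected subset cannot score below the all-features baseline on the split used for fitness. A real study would estimate fitness by cross-validation and report results on data not used during the search.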