Better Decision Tree from Intelligent Instance Selection: A new instance selection method based on Genetic Algorithm for optimizing decision trees

  • Authors:
  • Shuning Wu

  • Affiliations:
  • -

  • Venue:
  • Better Decision Tree from Intelligent Instance Selection: A new instance selection method based on Genetic Algorithm for optimizing decision trees
  • Year:
  • 2009

Abstract

This book describes theoretical and experimental studies of instance selection as a way to improve data mining models. Data preparation is one of the most important and time-consuming phases of knowledge discovery, and preparation tasks often determine the success of data mining engagements. Instance selection is the primary focus because the size of current and future databases often exceeds the amount of data that current data mining algorithms can handle properly. Instance selection can therefore be used to improve the scalability of data mining algorithms as well as the quality of their results. This book presents a new optimization-based approach to instance selection that uses a genetic algorithm (GA) to select a subset of instances from which a simpler decision tree model with acceptable accuracy is built. The resulting trees are easier for the decision maker to comprehend and interpret, and hence more useful in practice. Numerical results for several difficult test data sets indicate that GA-based instance selection can often reduce the size of the decision tree by an order of magnitude while still maintaining good prediction accuracy.
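The idea of GA-based instance selection summarized above can be illustrated with a minimal sketch. Nothing below is taken from the book: the bit-mask encoding, the fitness function (validation accuracy minus a penalty `alpha` per tree node), the elitist one-point-crossover GA, the small CART-style tree learner, and the synthetic noisy data are all assumptions chosen only to make the example self-contained.

```python
import random

def gini(labels):
    # Gini impurity for 0/1 labels.
    n = len(labels)
    p = sum(labels) / n if n else 0.0
    return 2 * p * (1 - p)

def build_tree(rows, depth=0, max_depth=6):
    # Grow a small CART-style tree on rows of (features, label), labels in {0, 1}.
    labels = [y for _, y in rows]
    majority = int(sum(labels) * 2 >= len(labels))
    if depth == max_depth or gini(labels) == 0.0:
        return majority  # leaf: predicted class
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if not left or not right:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority
    _, f, t = best
    return (f, t,
            build_tree([r for r in rows if r[0][f] <= t], depth + 1, max_depth),
            build_tree([r for r in rows if r[0][f] > t], depth + 1, max_depth))

def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def size(tree):
    # Total number of nodes (internal nodes plus leaves).
    return 1 + size(tree[2]) + size(tree[3]) if isinstance(tree, tuple) else 1

def accuracy(tree, rows):
    return sum(predict(tree, x) == y for x, y in rows) / len(rows)

def fitness(mask, train, valid, alpha=0.01):
    # Assumed fitness: reward validation accuracy, penalize tree size.
    subset = [r for r, keep in zip(train, mask) if keep]
    if not subset:
        return float("-inf")
    tree = build_tree(subset)
    return accuracy(tree, valid) - alpha * size(tree)

def ga_select(train, valid, pop_size=20, generations=15, p_mut=0.02, seed=0):
    # Each individual is a 0/1 mask over training instances.
    rng = random.Random(seed)
    n = len(train)
    pop = [[1] * n]  # include "keep everything" as a baseline individual
    pop += [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(m, train, valid), reverse=True)
        elite = ranked[: pop_size // 2]  # elitism: best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = [bit ^ (rng.random() < p_mut) for bit in a[:cut] + b[cut:]]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, train, valid))

def make_data(n, rng, noise=0.15):
    # Synthetic two-feature problem with label noise (illustrative only).
    rows = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        y = int(x[0] + x[1] > 1.0)
        rows.append((x, 1 - y if rng.random() < noise else y))
    return rows

rng = random.Random(42)
train, valid = make_data(80, rng), make_data(80, rng)
full_tree = build_tree(train)
mask = ga_select(train, valid)
small_tree = build_tree([r for r, keep in zip(train, mask) if keep])
print("all instances:     ", size(full_tree), "nodes, accuracy", accuracy(full_tree, valid))
print("selected instances:", size(small_tree), "nodes, accuracy", accuracy(small_tree, valid))
```

Because the best individuals are carried over unchanged each generation and the all-ones mask is in the initial population, the returned mask never scores worse under this assumed fitness than training on every instance. How much the tree actually shrinks depends on the data and on `alpha`, and in practice the final tree should be evaluated on a separate test set that the GA never saw.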