Building multi-way decision trees with numerical attributes

  • Authors:
  • Fernando Berzal, Juan-Carlos Cubero, Nicolás Marín, Daniel Sánchez

  • Affiliations:
  • Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain (all authors)

  • Venue:
  • Information Sciences: an International Journal

  • Year:
  • 2004

Abstract

Decision trees are probably the most popular and commonly used classification model. They are built recursively, following a top-down approach (from general concepts to particular examples), by repeatedly splitting the training dataset. When the dataset contains numerical attributes, binary splits are usually performed by choosing the threshold value that minimizes the impurity measure used as the splitting criterion (e.g. the gain ratio in C4.5 or the Gini index in CART). In this paper, we propose the use of multi-way splits for continuous attributes in order to reduce tree complexity without decreasing classification accuracy. This can be done by intertwining a hierarchical clustering algorithm with the usual greedy decision tree learning process.
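
The abstract only outlines the idea, so the following is a minimal, self-contained Python sketch of how a multi-way split on a numeric attribute could be obtained: adjacent value intervals are merged agglomeratively and candidate partitions are scored with the weighted Gini index. The function and parameter names (`multiway_split`, `max_intervals`) are hypothetical, and the merging and stopping rules are assumptions for illustration, not the paper's exact procedure.

```python
"""Illustrative sketch (not the authors' exact algorithm): build a multi-way
split of a numeric attribute by agglomeratively merging adjacent intervals,
scoring candidate partitions with the weighted Gini impurity."""

from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(intervals):
    """Size-weighted average impurity of a partition into intervals."""
    total = sum(len(lab) for _, lab in intervals)
    return sum(len(lab) / total * gini(lab) for _, lab in intervals)


def multiway_split(values, labels, max_intervals=4):
    """Greedy agglomerative grouping of sorted attribute values.

    Start with one interval per distinct value and repeatedly merge the
    adjacent pair whose merge increases the weighted Gini impurity least,
    until at most `max_intervals` remain (a hypothetical stopping rule).
    Return the cut points and the impurity of the resulting partition.
    """
    # Group labels by distinct sorted attribute value.
    pairs = sorted(zip(values, labels))
    intervals = []
    for v, y in pairs:
        if intervals and intervals[-1][0][-1] == v:
            intervals[-1][1].append(y)
        else:
            intervals.append(([v], [y]))
    # Agglomerative merging of adjacent intervals.
    while len(intervals) > max_intervals:
        best_i, best_score = None, None
        for i in range(len(intervals) - 1):
            merged = (intervals[i][0] + intervals[i + 1][0],
                      intervals[i][1] + intervals[i + 1][1])
            candidate = intervals[:i] + [merged] + intervals[i + 2:]
            score = weighted_gini(candidate)
            if best_score is None or score < best_score:
                best_i, best_score = i, score
        i = best_i
        intervals[i:i + 2] = [(intervals[i][0] + intervals[i + 1][0],
                               intervals[i][1] + intervals[i + 1][1])]
    # Cut points lie halfway between adjacent interval boundaries.
    cuts = [(intervals[i][0][-1] + intervals[i + 1][0][0]) / 2
            for i in range(len(intervals) - 1)]
    return cuts, weighted_gini(intervals)


if __name__ == "__main__":
    # Toy example: a numeric attribute whose classes cluster into three ranges.
    values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
    labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
    cuts, impurity = multiway_split(values, labels, max_intervals=3)
    print("cut points:", cuts, "weighted Gini:", round(impurity, 3))
```

In a full learner, a multi-way partition like this would define one candidate test node for the attribute, to be compared against the usual binary threshold split under the tree's splitting criterion before the recursion continues on each branch.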