A heuristic for learning decision trees and pruning them into classification rules

  • Authors:
  • José Ranilla; Oscar Luaces; Antonio Bahamonde

  • Affiliations:
  • Artificial Intelligence Center, University of Oviedo at Gijón, E-33271 Gijón, Spain. E-mail: (ranilla,oluaces,antonio)@aic.uniovi.es (all three authors)

  • Venue:
  • AI Communications
  • Year:
  • 2003

Abstract

Let us consider a set of training examples described by continuous or symbolic attributes with categorical classes. In this paper we present a measure of the potential quality of a region of the attribute space to be represented as a rule condition to classify unseen cases. The aim is to take into account the distribution of the classes of the examples. The resulting measure, called impurity level, is inspired by a similar measure used in the instance-based algorithm IB3 for selecting suitable paradigmatic exemplars that will classify, in a nearest-neighbor context, future cases. The features of the impurity level are illustrated using a version of Quinlan's well-known C4.5 where the information-based heuristics are replaced by our measure. The experiments carried out to test the proposals indicate a very high accuracy reached with sets of classification rules as small as those found by RIPPER.
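
The abstract does not give the formula for the impurity level, so the following is only a minimal sketch of the IB3-style test that the abstract names as its inspiration, not the authors' measure. It assumes the usual description of IB3, in which an exemplar is accepted when the lower confidence bound on its classification accuracy exceeds the upper confidence bound on the observed frequency of its class. The function names, the parameter names, and the default z value are illustrative assumptions.

```python
import math


def wilson_interval(successes, trials, z):
    """Return (lower, upper) Wilson score bounds for a proportion.

    With no trials there is no evidence, so the widest interval is returned.
    """
    if trials == 0:
        return 0.0, 1.0
    p = successes / trials
    z2 = z * z
    centre = p + z2 / (2 * trials)
    spread = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials))
    denom = 1 + z2 / trials
    return (centre - spread) / denom, (centre + spread) / denom


def is_acceptable(correct, attempts, class_count, seen, z=0.9):
    """IB3-style acceptance test (assumed form, not the paper's impurity level).

    correct / attempts : how often the exemplar classified cases correctly
    class_count / seen : how often the exemplar's class occurred so far
    """
    accuracy_low, _ = wilson_interval(correct, attempts, z)
    _, frequency_high = wilson_interval(class_count, seen, z)
    return accuracy_low > frequency_high


if __name__ == "__main__":
    # An exemplar that is right 18 times out of 20, for a class that
    # covers 30 of the 100 examples seen so far.
    print(is_acceptable(18, 20, 30, 100))
    # An exemplar that is right only 3 times out of 10 for the same class.
    print(is_acceptable(3, 10, 30, 100))
```

Comparing interval bounds rather than raw proportions keeps a region (or exemplar) supported by very few examples from being judged reliable on the strength of a lucky streak, which is the kind of class-distribution awareness the abstract describes.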