Decision trees with minimal costs

  • Authors: Charles X. Ling; Qiang Yang; Jianning Wang; Shichao Zhang

  • Affiliations: The University of Western Ontario, London, Ontario, Canada; Hong Kong University of Science and Technology, Kowloon, Hong Kong; The University of Western Ontario, London, Ontario, Canada; Guangxi Normal University, China

  • Venue: ICML '04 Proceedings of the twenty-first international conference on Machine learning
  • Year: 2004

Abstract

We propose a simple, novel, and effective method for building and testing decision trees that minimizes the sum of misclassification and test costs. Specifically, we first put forward an original and simple splitting criterion for attribute selection during tree building. The resulting tree-building algorithm has many desirable properties for a cost-sensitive learning system that must account for both types of cost. Then, assuming that test cases may have a large number of missing values, we design several intelligent test strategies that suggest which missing values to obtain, at a cost, so as to minimize the total cost. We experimentally compare these strategies with C4.5 and demonstrate that our new algorithms significantly outperform C4.5 and its variations. In addition, our algorithm's complexity is similar to that of C4.5 and much lower than that of previous work. Our work is useful for many diagnostic tasks that must factor in both misclassification costs and the test costs of obtaining missing information.
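The abstract does not spell out the splitting criterion, so the following is only a minimal sketch of the general idea of choosing an attribute by total cost (test cost plus misclassification cost) rather than by information gain. The function names, the per-example test cost model, the binary labels, and the toy data are all assumptions made for illustration, not the authors' definitions.

```python
def leaf_cost(n_pos, n_neg, fp_cost, fn_cost):
    """Misclassification cost of a node labeled with its cheaper class."""
    # Predicting positive turns every negative into a false positive;
    # predicting negative turns every positive into a false negative.
    return min(n_neg * fp_cost, n_pos * fn_cost)


def split_cost(rows, attr, test_cost, fp_cost, fn_cost):
    """Cost of testing `attr` on every row, then labeling each branch."""
    counts = {}  # attribute value -> [positives, negatives]
    for row in rows:
        pos_neg = counts.setdefault(row[attr], [0, 0])
        pos_neg[0 if row["label"] else 1] += 1
    branch_cost = sum(leaf_cost(p, n, fp_cost, fn_cost)
                      for p, n in counts.values())
    return test_cost * len(rows) + branch_cost


def best_attribute(rows, test_costs, fp_cost, fn_cost):
    """Return (attribute, cost); attribute is None if no split is cheaper."""
    n_pos = sum(1 for r in rows if r["label"])
    best, best_cost = None, leaf_cost(n_pos, len(rows) - n_pos,
                                      fp_cost, fn_cost)
    for attr, cost_per_test in test_costs.items():
        cost = split_cost(rows, attr, cost_per_test, fp_cost, fn_cost)
        if cost < best_cost:
            best, best_cost = attr, cost
    return best, best_cost


# Hypothetical toy data: "xray" separates the classes, "fever" does not.
rows = [
    {"fever": "yes", "xray": "abnormal", "label": True},
    {"fever": "yes", "xray": "normal", "label": False},
    {"fever": "no", "xray": "abnormal", "label": True},
    {"fever": "no", "xray": "normal", "label": False},
]
print(best_attribute(rows, {"fever": 1, "xray": 20}, fp_cost=50, fn_cost=100))
print(best_attribute(rows, {"fever": 1, "xray": 30}, fp_cost=50, fn_cost=100))
```

In this toy setting the informative test is chosen only while its cost stays below the misclassification cost it saves; once the test becomes too expensive, the sketch prefers not to split at all, which is the kind of trade-off a cost-sensitive criterion has to make.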