This article describes how misclassification costs given with the individual training objects can be used to construct decision trees that make cost-minimal rather than error-minimal class decisions. This is demonstrated by defining modified, cost-dependent probabilities and a new, cost-dependent information measure, and by using a cost-sensitive extension of the CAL5 algorithm for learning decision trees. The cost-dependent information measure ensures that the locally next best, i.e. cost-minimizing, discriminating attribute is selected during the sequential construction of the classification trees. It is shown to be a cost-dependent generalization of the classical information measure introduced by Shannon, which depends only on classical probabilities. It is therefore of general importance and extends classical information theory, knowledge processing, and cognitive science, since subjective evaluations of decision alternatives can be included in the entropy and the transferred information. Decision trees can then be viewed as cost-minimizing decoders for class symbols emitted by a source and coded by feature vectors. Experiments with two artificial datasets and one application example show that this approach is more accurate than a method that uses class-dependent costs given a priori by experts.
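To make the idea concrete, the following is a minimal sketch of one common way to build such a cost-dependent measure: class probabilities are reweighted by per-class misclassification costs (p'_i ∝ c_i · p_i), a Shannon-style entropy is computed over the reweighted probabilities, and the attribute minimizing the expected cost-entropy after a split is chosen. The function names and the exact reweighting scheme are illustrative assumptions; the article's CAL5 extension defines its own cost-dependent probabilities and measure.

```python
import math
from collections import Counter

def cost_weighted_probs(labels, costs):
    # Reweight empirical class probabilities by per-class misclassification
    # costs and renormalize: p'_i = c_i * p_i / sum_j c_j * p_j.
    # (Illustrative scheme, not necessarily the article's exact definition.)
    counts = Counter(labels)
    n = len(labels)
    w = {c: costs[c] * counts[c] / n for c in counts}
    z = sum(w.values())
    return {c: v / z for c, v in w.items()}

def cost_entropy(labels, costs):
    # Shannon entropy computed over the cost-weighted probabilities.
    probs = cost_weighted_probs(labels, costs)
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

def best_split(rows, labels, costs, n_features):
    # Greedy attribute selection: pick the (categorical) feature whose
    # partition minimizes the expected cost-entropy of the children.
    best_f, best_h = None, float("inf")
    for f in range(n_features):
        parts = {}
        for row, y in zip(rows, labels):
            parts.setdefault(row[f], []).append(y)
        h = sum(len(ys) / len(labels) * cost_entropy(ys, costs)
                for ys in parts.values())
        if h < best_h:
            best_f, best_h = f, h
    return best_f, best_h
```

With uniform costs this reduces to the classical entropy criterion; raising the cost of one class skews the measure so that splits isolating the expensive class are preferred.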