Top-down decision tree learning as information based boosting

Authors:
Eiji Takimoto;Akira Maruoka
Affiliations:
Graduate School of Information Sciences, Tohoku University, 980-8579 Sendai, Japan;Graduate School of Information Sciences, Tohoku University, 980-8579 Sendai, Japan
Venue:
Theoretical Computer Science
Year:
2003

References (12):
The Strength of Weak Learnability. Machine Learning
Machine learning: a theoretical approach
C4.5: programs for machine learning
Boosting a weak learning algorithm by majority. Information and Computation
Game theory, on-line prediction and boosting. COLT '96 Proceedings of the ninth annual conference on Computational learning theory
A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Improved boosting algorithms using confidence-rated predictions. COLT '98 Proceedings of the eleventh annual conference on Computational learning theory
On the boosting ability of top-down decision tree learning algorithms. Journal of Computer and System Sciences
Boosting the margin: A new explanation for the effectiveness of voting methods. ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Mutual Information Gaining Algorithm and Its Relation to PAC-Learning Algorithm. AII '94 Proceedings of the 4th International Workshop on Analogical and Inductive Inference: Algorithmic Learning Theory
Improving Algorithms for Boosting. COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory
Boosting Using Branching Programs. COLT '00 Proceedings of the Thirteenth Annual Conference on Computational Learning Theory

Cited by (1):
Top-Down Decision Tree Boosting and Its Applications. Progress in Discovery Science, Final Report of the Japanese Discovery Science Project

Abstract

We consider a boosting technique that can be applied directly to multiclass classification problems. Although many boosting algorithms have been proposed, most were developed essentially for binary classification, so a multiclass problem must first be reduced in some way to binary ones before they can be applied. To avoid such reductions, we introduce the notion of a pseudo-entropy function G, which yields an information-theoretic criterion, called the conditional G-entropy, for measuring the loss of hypotheses. The conditional G-entropy turns out to be useful for defining the weakness of hypotheses that approximate, in some way, a general multiclass function, so that the boosting problem can be considered without reduction. We show that the top-down decision tree learning algorithm that uses the conditional G-entropy as its splitting criterion is an efficient boosting algorithm. That is, the algorithm aims to minimize the conditional G-entropy rather than the classification error. In the binary case, our algorithm turns out to be identical to the error-based boosting algorithm proposed by Kearns and Mansour, and our analysis gives a simpler proof of their results.
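
As a rough illustration of the splitting criterion described in the abstract, the following Python sketch picks, among candidate hypotheses, the one with the smallest conditional entropy of the multiclass label. Several things here are assumptions, not taken from the paper: Shannon entropy stands in for the pseudo-entropy function G (the abstract does not define G), the conditional G-entropy is read as the probability-weighted average of G over the label distributions induced by each hypothesis output, and the names `shannon_entropy`, `conditional_g_entropy`, and `best_split` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(dist):
    """Assumed stand-in for the pseudo-entropy G: Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def class_distribution(labels):
    """Empirical distribution of the class labels in a sample."""
    counts = Counter(labels)
    n = len(labels)
    return [c / n for c in counts.values()]


def conditional_g_entropy(labels, outputs, G=shannon_entropy):
    """Average of G over the label distributions of each hypothesis output,
    weighted by how often that output occurs (an assumed reading of the
    conditional G-entropy)."""
    groups = defaultdict(list)
    for y, z in zip(labels, outputs):
        groups[z].append(y)
    n = len(labels)
    return sum(len(ys) / n * G(class_distribution(ys)) for ys in groups.values())


def best_split(rows, labels, hypotheses, G=shannon_entropy):
    """Return the candidate hypothesis (a callable row -> value) whose
    outputs leave the smallest conditional G-entropy of the labels."""
    return min(
        hypotheses,
        key=lambda h: conditional_g_entropy(labels, [h(r) for r in rows], G),
    )


# Toy three-class sample: the first attribute determines the class,
# the second attribute carries no information about it.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
labels = ["a", "a", "b", "b", "c", "c"]


def first_attribute(row):
    return row[0]


def second_attribute(row):
    return row[1]


chosen = best_split(rows, labels, [first_attribute, second_attribute])
```

A top-down learner in this spirit would apply `best_split` at a leaf, partition the sample by the chosen hypothesis's outputs, and recurse on each part. Because the criterion works on label distributions over any number of classes, no reduction to binary problems is needed in this sketch.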