Data mining tasks and methods: Classification: decision-tree discovery

Authors:
Ronny Kohavi; J. Ross Quinlan
Affiliations:
Senior Director of Data Mining Applications, Blue Martini Software, San Mateo, California; Executive Director, RuleQuest Research Party Limited, Sydney, Australia
Venue:
Handbook of data mining and knowledge discovery
Year:
2002


Abstract

We describe the two most commonly used systems for induction of decision trees for classification: C4.5 and CART. We highlight the methods and different decisions made in each system with respect to splitting criteria, pruning, noise handling, and other differentiating features. We describe how rules can be derived from decision trees and point to some differences in the induction of regression trees. We conclude with some pointers to advanced techniques, including ensemble methods, oblique splits, grafting, and coping with large data sets.
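The abstract's contrast in splitting criteria can be illustrated with a small sketch. It assumes the criteria usually associated with the two systems: gain ratio for C4.5 and decrease in Gini impurity for CART. The abstract itself does not name them. All function names and the toy data are hypothetical, and the Gini function accepts any partition, whereas CART proper considers only binary splits.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy, in bits, of the value distribution in a list."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gini(labels):
    """Gini impurity of the class distribution in a list of labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group class labels by the attribute value of each example."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def gain_ratio(values, labels):
    """Information gain divided by split information (assumed C4.5-style)."""
    n = len(labels)
    groups = partition(values, labels)
    gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)
    split_info = entropy(values)  # entropy of the attribute's own values
    return gain / split_info if split_info > 0 else 0.0


def gini_decrease(values, labels):
    """Reduction in Gini impurity from the split (assumed CART-style)."""
    n = len(labels)
    groups = partition(values, labels)
    return gini(labels) - sum(len(g) / n * gini(g) for g in groups)


# Hypothetical toy data: one attribute separates the classes, one does not.
labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]

perfect_gr = gain_ratio(informative, labels)
perfect_gd = gini_decrease(informative, labels)
useless_gr = gain_ratio(uninformative, labels)
useless_gd = gini_decrease(uninformative, labels)
```

Both scores rank the informative attribute above the uninformative one here. The criteria can diverge on attributes with many distinct values, where the split-information denominator penalizes highly fragmented partitions.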