This paper proposes a new node splitting measure for decision tree induction, termed the distinct class based splitting measure (DCSM), which gives importance to the number of distinct classes in a partition. The measure is the product of two terms. The first term depends on the number of distinct classes in each child partition: as the number of distinct classes in a partition increases, this term increases, penalizing partitions that mix many classes; purer partitions are thus preferred. The second term decreases when a single class accounts for a larger share of the examples in the partition, so the combination still favors purer partitions. It is shown that DCSM satisfies two important properties that a split measure should possess, namely convexity and well-behavedness. Results obtained over several datasets indicate that decision trees induced using DCSM provide better classification accuracy and are more compact (have fewer nodes) than trees induced using two of the most popular node splitting measures presently in use.
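The abstract does not give DCSM's exact formula, but the product structure it describes (a distinct-class term times a purity term, aggregated over child partitions) can be sketched as follows. This is a hypothetical illustration of that structure, not the paper's actual definition: the choice of `distinct` as the first term and `1 - max_c p_c` as the second term is an assumption for clarity.

```python
from collections import Counter

def dcsm_like(partitions):
    """Illustrative splitting measure with a DCSM-like product structure.

    `partitions` is a list of child partitions, each a list of class labels.
    NOTE: hypothetical formula for illustration; the paper's exact DCSM
    definition differs.
    """
    total = sum(len(p) for p in partitions)
    score = 0.0
    for part in partitions:
        counts = Counter(part)
        n = len(part)
        # First term: grows with the number of distinct classes in the child,
        # penalizing mixed partitions.
        distinct = len(counts)
        # Second term: shrinks as one class dominates the partition,
        # so pure partitions contribute little.
        impurity = 1.0 - max(counts.values()) / n
        # Weight each child's contribution by its share of the examples.
        score += (n / total) * distinct * impurity
    return score  # lower is better: the split with the smallest score is chosen
```

For example, a split that separates the classes perfectly (`[['a','a'], ['b','b']]`) scores 0.0, while one that leaves both children evenly mixed (`[['a','b'], ['a','b']]`) scores 1.0, so minimizing the measure selects the purer split.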