Partitioning Nominal Attributes in Decision Trees

  • Authors:
  • Don Coppersmith; Se June Hong; Jonathan R. M. Hosking

  • Affiliations:
  • IBM Research Division, T.J. Watson Research Center, Yorktown Heights, NY 10598, USA (copper@watson.ibm.com; hong@watson.ibm.com; hosking@watson.ibm.com)

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 1999

Abstract

To find the optimal branching of a nominal attribute at a node in an L-ary decision tree, one is often forced to search over all possible L-ary partitions for the one that yields the minimum impurity measure. For binary trees (L = 2) when there are just two classes, a short-cut search is possible that is linear in n, the number of distinct values of the attribute. For the general case in which the number of classes, k, may be greater than two, Burshtein et al. have shown that the optimal partition satisfies a condition that involves the existence of $\binom{L}{2}$ hyperplanes in the class probability space. We derive a property of the optimal partition for concave impurity measures (including in particular the Gini and entropy impurity measures) in terms of the existence of L vectors in the dual of the class probability space, which implies the earlier condition. Unfortunately, these insights still do not offer a practical search method when n and k are large, even for binary trees. We therefore present a new heuristic search algorithm to find a good partition. It is based on ordering the attribute's values according to their principal component scores in the class probability space, and is linear in n. We demonstrate the effectiveness of the new method through Monte Carlo simulation experiments and compare its performance against other heuristic methods.
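As a concrete illustration of the heuristic described in the abstract, the sketch below orders a nominal attribute's values by their first principal component scores in the class probability space and then scans the n-1 contiguous cut points along that ordering. The function name pca_order_partition, the counts-matrix interface, and the choice of the Gini measure for the scan are assumptions made for illustration; they are not the authors' exact implementation, which may differ in details such as weighting and tie-breaking.

```python
import numpy as np

def pca_order_partition(counts):
    """Heuristic binary split of a nominal attribute (illustrative sketch).

    counts: (n, k) array; counts[i, j] = number of training examples
    with attribute value i and class j. Assumes every value occurs at
    least once. Returns the value indices assigned to the left branch.
    """
    counts = np.asarray(counts, dtype=float)
    weights = counts.sum(axis=1)           # examples observed per value
    probs = counts / weights[:, None]      # class probability vector per value

    # Weighted first principal component of the class probability vectors.
    mean = np.average(probs, axis=0, weights=weights)
    centered = probs - mean
    cov = (weights[:, None] * centered).T @ centered / weights.sum()
    pc1 = np.linalg.eigh(cov)[1][:, -1]    # eigenvector of largest eigenvalue

    # Order the attribute values by their principal component scores.
    order = np.argsort(probs @ pc1)

    def weighted_gini(c):
        """Gini impurity of a class-count vector, weighted by its size."""
        total = c.sum()
        p = c / total
        return total * (1.0 - np.dot(p, p))

    # Scan the n-1 contiguous cut points along the ordering, maintaining
    # running class totals so the whole scan is linear in n.
    left = np.zeros(counts.shape[1])
    right = counts.sum(axis=0)
    best_cut, best_impurity = 1, np.inf
    for cut in range(1, len(order)):
        moved = counts[order[cut - 1]]
        left += moved
        right -= moved
        impurity = weighted_gini(left) + weighted_gini(right)
        if impurity < best_impurity:
            best_cut, best_impurity = cut, impurity
    return order[:best_cut]
```

For example, with three attribute values whose class counts are [[9, 1], [5, 5], [1, 9]], the principal component ordering places the values along the natural gradient from class 0 to class 1, and the scan then only needs to consider two cut points rather than all subsets.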