Building multi-way decision trees with numerical attributes

  • Authors:
  • Fernando Berzal, Juan-Carlos Cubero, Nicolás Marín, Daniel Sánchez

  • Affiliations:
  • Department of Computer Science and Artificial Intelligence, University of Granada, 18071 Granada, Spain (all authors)

  • Venue:
  • Information Sciences: an International Journal

  • Year:
  • 2004

Abstract

Decision trees are probably the most popular and commonly used classification model. They are built recursively, following a top-down approach (from general concepts to particular examples), by repeatedly splitting the training dataset. When the dataset contains numerical attributes, binary splits are usually performed by choosing the threshold value that minimizes the impurity measure used as the splitting criterion (e.g. the gain ratio in C4.5 or the Gini index in CART). In this paper, we propose the use of multi-way splits for continuous attributes in order to reduce tree complexity without decreasing classification accuracy. This can be done by intertwining a hierarchical clustering algorithm with the usual greedy decision tree learning process.
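
The abstract only outlines the idea, so the following is a minimal, self-contained Python sketch of how a multi-way split on a numeric attribute could be obtained: adjacent value intervals are merged agglomeratively and candidate partitions are scored with the weighted Gini index. The function and parameter names (`multiway_split`, `max_intervals`) are hypothetical, and the merging and stopping rules are assumptions for illustration, not the paper's exact procedure.

```python
"""Illustrative sketch (not the authors' exact algorithm): build a multi-way
split of a numeric attribute by agglomeratively merging adjacent intervals,
scoring candidate partitions with the weighted Gini impurity."""

from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(intervals):
    """Size-weighted average impurity of a partition into intervals."""
    total = sum(len(lab) for _, lab in intervals)
    return sum(len(lab) / total * gini(lab) for _, lab in intervals)


def multiway_split(values, labels, max_intervals=4):
    """Greedy agglomerative grouping of sorted attribute values.

    Start with one interval per distinct value and repeatedly merge the
    adjacent pair whose merge increases the weighted Gini impurity least,
    until at most `max_intervals` remain (a hypothetical stopping rule).
    Return the cut points and the impurity of the resulting partition.
    """
    # Group labels by distinct sorted attribute value.
    pairs = sorted(zip(values, labels))
    intervals = []
    for v, y in pairs:
        if intervals and intervals[-1][0][-1] == v:
            intervals[-1][1].append(y)
        else:
            intervals.append(([v], [y]))
    # Agglomerative merging of adjacent intervals.
    while len(intervals) > max_intervals:
        best_i, best_score = None, None
        for i in range(len(intervals) - 1):
            merged = (intervals[i][0] + intervals[i + 1][0],
                      intervals[i][1] + intervals[i + 1][1])
            candidate = intervals[:i] + [merged] + intervals[i + 2:]
            score = weighted_gini(candidate)
            if best_score is None or score < best_score:
                best_i, best_score = i, score
        i = best_i
        intervals[i:i + 2] = [(intervals[i][0] + intervals[i + 1][0],
                               intervals[i][1] + intervals[i + 1][1])]
    # Cut points lie halfway between adjacent interval boundaries.
    cuts = [(intervals[i][0][-1] + intervals[i + 1][0][0]) / 2
            for i in range(len(intervals) - 1)]
    return cuts, weighted_gini(intervals)


if __name__ == "__main__":
    # Toy example: a numeric attribute whose classes cluster into three ranges.
    values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
    labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
    cuts, impurity = multiway_split(values, labels, max_intervals=3)
    print("cut points:", cuts, "weighted Gini:", round(impurity, 3))
```

In a full learner, a multi-way partition like this would define one candidate test node for the attribute, to be compared against the usual binary threshold split under the tree's splitting criterion before the recursion continues on each branch.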