Classification tree analysis using TARGET

Authors:
J. Brian Gray; Guangzhe Fan
Affiliations:
Department of Information Systems, Statistics and Management Science, The University of Alabama, Tuscaloosa, AL 35487-0226, USA; Department of Statistics and Actuarial Science, Center of Computational Mathematics for Industry and Commerce, University of Waterloo, Waterloo, Ont., Canada N2L 3G1
Venue:
Computational Statistics & Data Analysis
Year:
2008

References (6):
Boosting a weak learning algorithm by majority. Information and Computation.
Bagging predictors. Machine Learning.
Genetic algorithms and their statistical applications: an introduction. Computational Statistics & Data Analysis.
Genetic Algorithms in Search, Optimization and Machine Learning.
Random Forests. Machine Learning.
Genetic algorithms for the identification of additive and innovation outliers in time series. Computational Statistics & Data Analysis.

Cited by (10):
Evolutionary model tree induction. Proceedings of the 2010 ACM Symposium on Applied Computing.
Forest classification trees and forest support vector machines algorithms: Demonstration using microarray data. Computers in Biology and Medicine.
Enhancing the classification accuracy by scatter-search-based ensemble approach. Applied Soft Computing.
A comparison of different rule-based statistical models for modeling geogenic groundwater contamination. Environmental Modelling & Software.
Mining data with random forests: A survey and results of new tests. Pattern Recognition.
Evolutionary model trees for handling continuous classes in machine learning. Information Sciences: an International Journal.
Genetics-based machine learning for rule induction: state of the art, taxonomy, and comparative study. IEEE Transactions on Evolutionary Computation.
Parallel hierarchical sampling: A general-purpose interacting Markov chains Monte Carlo algorithm. Computational Statistics & Data Analysis.
Application of hybrid case-based reasoning for enhanced performance in bankruptcy prediction. Information Sciences: an International Journal.
A novel approach for designing adaptive fuzzy classifiers based on the combination of an artificial immune network and a memetic algorithm. Information Sciences: an International Journal.


Abstract

Tree models are valuable tools for predictive modeling and data mining. Traditional tree-growing methodologies such as CART are known to suffer from greediness, instability, and bias in split rule selection. Alternative tree methods, including Bayesian CART (Chipman et al., 1998; Denison et al., 1998), random forests (Breiman, 2001a), bootstrap bumping (Tibshirani and Knight, 1999), QUEST (Loh and Shih, 1997), and CRUISE (Kim and Loh, 2001), have been proposed to address one or more of these issues, but each has its own drawbacks. Gray and Fan (2003) described a genetic algorithm approach to constructing decision trees, called tree analysis with randomly generated and evolved trees (TARGET), that searches the tree model space more effectively and largely resolves the problems with current tree modeling techniques. Utilizing the Bayesian information criterion (BIC), Fan and Gray (2005) developed a version of TARGET for regression tree analysis. In this article, we consider the construction of classification trees using TARGET. We modify the BIC to handle a categorical response variable and adjust its penalty component to better account for the model complexity of TARGET. We also incorporate the option of splitting rules based on linear combinations of two or three variables, which greatly improves the prediction accuracy of TARGET trees. Comparisons of TARGET with existing methods on simulated and real data sets indicate that TARGET has advantages over these other approaches.
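
To make the two technical ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the textbook form of the BIC (minus twice the multinomial log-likelihood plus a parameter count times log n) applied to the class counts in a tree's leaves, and a split rule that thresholds a weighted sum of a few variables. The abstract does not give the exact penalty adjustment used in TARGET, so `penalty_per_split` is a hypothetical knob, and all function names here are illustrative.

```python
import math
from typing import Sequence


def leaf_log_likelihood(counts: Sequence[int]) -> float:
    """Multinomial log-likelihood of one leaf at its observed class proportions."""
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)


def bic_style_score(
    leaf_counts: Sequence[Sequence[int]],
    n_classes: int,
    penalty_per_split: float = 1.0,  # assumption: not the paper's actual adjustment
) -> float:
    """BIC-style score for a classification tree; lower is better in this convention."""
    n = sum(sum(counts) for counts in leaf_counts)
    n_leaves = len(leaf_counts)
    log_lik = sum(leaf_log_likelihood(counts) for counts in leaf_counts)
    # (K - 1) free class probabilities per leaf, plus an assumed cost per split.
    n_params = n_leaves * (n_classes - 1) + penalty_per_split * (n_leaves - 1)
    return -2.0 * log_lik + n_params * math.log(n)


def linear_combination_split(
    x: Sequence[float], weights: Sequence[float], threshold: float
) -> bool:
    """Return True if the observation goes to the left child of the split."""
    return sum(w * v for w, v in zip(weights, x)) <= threshold


# A tree with two pure leaves versus a single-leaf tree on the same 20 cases.
split_tree = bic_style_score([[10, 0], [0, 10]], n_classes=2)
stump = bic_style_score([[10, 10]], n_classes=2)
print(split_tree < stump)
```

Raising `penalty_per_split` makes larger trees more expensive, which is one plausible way to reflect the extra search effort of a genetic algorithm in the complexity term; the published criterion may differ.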