The decision-tree (DT) algorithm is a popular and efficient data-mining technique. It is non-parametric and computationally fast, and besides producing interpretable classification rules, it can select features on its own. In this article, the feature-selection ability of DT and the impact of feature selection/extraction on DT under different training sample sizes were studied using AVIRIS hyperspectral data. DT was compared with three other feature-selection methods; the results indicated that DT was an unstable feature selector and that the number of features it selected was strongly related to the sample size. Trees derived with and without feature selection/extraction were also compared. Feature selection affected DT mainly through a significant increase in the number of tree nodes (14.13-23.81%) and a moderate increase in tree accuracy (3.5-4.8%). Feature-extraction methods such as Non-parametric Weighted Feature Extraction (NWFE) and Decision Boundary Feature Extraction (DBFE) improved tree accuracy more markedly (4.78-6.15%) while also reducing the number of tree nodes (6.89-16.81%). When the training sample size was small, feature selection/extraction increased accuracy more dramatically (6.90-15.66%) without increasing the number of tree nodes.
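The sketch below illustrates the kind of comparison the study describes, using scikit-learn. It is not the authors' protocol: synthetic data stands in for the AVIRIS imagery, a mutual-information SelectKBest filter stands in for the unnamed feature selectors, a CART-style tree stands in for the paper's DT implementation, and the feature counts and sample sizes are illustrative assumptions. It trains trees with and without prior feature selection at several training sample sizes, reports accuracy and node counts, and probes the stability of DT's own feature choices across bootstrap resamples.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic "hyperspectral-like" data: many correlated bands, a few classes.
X, y = make_classification(n_samples=2000, n_features=100, n_informative=15,
                           n_redundant=30, n_classes=4, random_state=0)

for n_train in (100, 400, 1600):  # vary the training sample size
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=n_train, stratify=y, random_state=0)

    # Baseline: tree grown on all bands; DT selects features implicitly
    # by choosing which attributes to split on.
    base = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

    # Explicit feature selection before tree induction (illustrative
    # stand-in for the selectors compared in the paper).
    sel = SelectKBest(mutual_info_classif, k=20).fit(X_tr, y_tr)
    slim = DecisionTreeClassifier(random_state=0).fit(
        sel.transform(X_tr), y_tr)

    print(f"n_train={n_train:5d}  "
          f"all-bands: acc={base.score(X_te, y_te):.3f} "
          f"nodes={base.tree_.node_count:4d} | "
          f"selected-20: acc={slim.score(sel.transform(X_te), y_te):.3f} "
          f"nodes={slim.tree_.node_count:4d}")

    # Rough stability probe: which bands does the tree actually split on
    # across bootstrap resamples of the same training set?
    rng = np.random.default_rng(0)
    sizes = []
    for _ in range(10):
        idx = rng.integers(0, n_train, n_train)
        t = DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx])
        feats = t.tree_.feature
        sizes.append(len(set(feats[feats >= 0])))  # negative values mark leaves
    print(f"            bands used by DT per bootstrap: "
          f"min={min(sizes)} max={max(sizes)}")

On data with this structure, the printout typically shows the node-count and accuracy trade-offs moving in the directions the abstract reports, and the bootstrap probe makes DT's instability as a feature selector concrete: the set of bands it splits on can change noticeably from resample to resample, especially at the smallest training size.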