Multi-objective optimization for incremental decision tree learning

Authors:
Hang Yang;Simon Fong;Yain-Whar Si
Affiliations:
Department of Science and Technology, University of Macau, Macau, China;Department of Science and Technology, University of Macau, Macau, China;Department of Science and Technology, University of Macau, Macau, China
Venue:
DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
Year:
2012

References:
C4.5: programs for machine learning
Data mining: practical machine learning tools and techniques with Java implementations
Mining high-speed data streams. Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining time-changing data streams. Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Induction of Decision Trees. Machine Learning
Pruning Decision Trees with Misclassification Costs. ECML '98 Proceedings of the 10th European Conference on Machine Learning
The Biases of Decision Tree Pruning Strategies. IDA '99 Proceedings of the Third International Symposium on Advances in Intelligent Data Analysis
Accurate decision trees for mining high-speed data streams. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
MOA: Massive Online Analysis. The Journal of Machine Learning Research
Moderated VFDT in stream mining using adaptive tie threshold and incremental pruning. DaWaK'11 Proceedings of the 13th international conference on Data warehousing and knowledge discovery

Abstract

Decision tree learning can be roughly classified into two categories: static and incremental induction. Static tree induction applies greedy search in its splitting tests to obtain a globally optimal model. Incremental tree induction constructs a decision model by analyzing data in short segments; during each segment, a locally optimal tree structure is formed. Very Fast Decision Tree (VFDT) [4] is a typical incremental tree induction method that uses the Hoeffding bound in its node-splitting test, but it does not work well on noisy data. In this paper, we propose a new incremental tree induction model called incrementally Optimized Very Fast Decision Tree (iOVFDT), which uses a multi-objective incremental optimization method. iOVFDT also integrates four classifiers at the leaves. The proposed model is tested on a large volume of data streams contaminated with noise. Under such noisy data, we investigate how iOVFDT, which represents incremental induction working with local optima, compares with C4.5, which loads the whole dataset to build a globally optimal decision tree. Our experimental results show that iOVFDT achieves similar, though slightly lower, accuracy, while its decision tree size and induction time are much smaller than those of C4.5.
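
The Hoeffding-bound splitting test mentioned in the abstract can be illustrated with a minimal sketch. It follows the standard form used in the VFDT literature, epsilon = sqrt(R^2 ln(1/delta) / (2n)), and is not the multi-objective method proposed for iOVFDT. The function names, parameter names, and default tie threshold are assumptions chosen for illustration only.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range: range R of the split heuristic (e.g. 1.0 for
                 information gain on a two-class problem)
    delta:       allowed probability that the bound does not hold
    n:           number of examples seen at the leaf
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n,
                 tie_threshold=0.05):
    """Standard VFDT-style test (illustrative, hypothetical helper).

    Split when the best attribute beats the runner-up by more than
    epsilon, or when epsilon has shrunk below the tie threshold, so
    that near-equal attributes do not block growth indefinitely.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, round(hoeffding_bound(1.0, 1e-7, n), 4))
```

Because epsilon shrinks as n grows, a leaf that cannot separate its two best attributes early on will eventually split once enough examples have arrived, either by a clear margin or through the tie threshold.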