BOAI: fast alternating decision tree induction based on bottom-up evaluation

Authors:
Bishan Yang;Tengjiao Wang;Dongqing Yang;Lei Chang
Affiliations:
Key Laboratory of High Confidence Software Technologies, Ministry of Education, China, School of Electronics Engineering and Computer Science, Peking University, Beijing, China (all four authors)
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Abstract

The Alternating Decision Tree (ADTree) is a successful boosting-based classification model with a wide range of applications. Existing ADTree induction algorithms apply a "top-down" strategy to evaluate the best split at each boosting iteration, which is very time-consuming and therefore unsuitable for modeling large data sets. This paper proposes a fast ADTree induction algorithm (BOAI) based on "bottom-up" evaluation, which offers high performance on massive data without sacrificing classification accuracy. BOAI uses a pre-sorting technique and dynamically evaluates splits with a bottom-up approach based on the VW-group. These techniques eliminate huge redundancy in sorting and computation during tree induction. Experimental results on both real and synthetic data sets show that BOAI outperforms the best existing ADTree induction algorithm by a significant margin. In a real case study, BOAI also performs better than TreeNet and Random Forests, which are considered efficient classification models.
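
The pre-sorting idea mentioned in the abstract can be illustrated with a small sketch. It is not the BOAI algorithm: it does not implement the VW-group or the bottom-up evaluation, whose details the abstract does not give. It only shows why sorting a numeric attribute once helps, since boosting changes instance weights between iterations but not the order of attribute values. The split score used here is the Z criterion from the standard ADTree learning algorithm, which is assumed to be the relevant measure; the names `presort`, `z_value`, and `best_threshold` are hypothetical.

```python
import math


def presort(values):
    """Return instance indices ordered by attribute value (done once per attribute)."""
    return sorted(range(len(values)), key=values.__getitem__)


def z_value(pos_true, neg_true, pos_false, neg_false, w_rest):
    """ADTree-style split score (assumed criterion); lower is better.

    pos_*/neg_* are positive/negative weight totals on each side of the
    split; w_rest is the weight of instances that do not reach the node.
    """
    # max(0.0, ...) guards against tiny negative values from float subtraction.
    a = math.sqrt(max(0.0, pos_true * neg_true))
    b = math.sqrt(max(0.0, pos_false * neg_false))
    return 2.0 * (a + b) + w_rest


def best_threshold(order, values, labels, weights, w_rest=0.0):
    """Score every threshold 'x <= t' in one sweep over a pre-sorted order.

    labels are +1 or -1. The sweep keeps running weight totals, so no
    re-sorting is needed when only the weights change.
    """
    total_pos = sum(w for w, y in zip(weights, labels) if y > 0)
    total_neg = sum(w for w, y in zip(weights, labels) if y <= 0)
    left_pos = left_neg = 0.0
    best_z, best_t = float("inf"), None
    for rank, i in enumerate(order):
        if labels[i] > 0:
            left_pos += weights[i]
        else:
            left_neg += weights[i]
        if rank + 1 == len(order):
            break  # no threshold after the largest value
        nxt = order[rank + 1]
        if values[nxt] == values[i]:
            continue  # thresholds only fall between distinct values
        z = z_value(left_pos, left_neg,
                    total_pos - left_pos, total_neg - left_neg, w_rest)
        if z < best_z:
            best_z, best_t = z, (values[i] + values[nxt]) / 2.0
    return best_z, best_t


if __name__ == "__main__":
    xs = [3.0, 1.0, 4.0, 2.0]
    ys = [-1, 1, -1, 1]
    order = presort(xs)  # computed once, reused in every boosting iteration
    print(best_threshold(order, xs, ys, [1.0] * 4))
```

In this sketch the same `order` can be passed to `best_threshold` again after the weights are updated, which is the sorting redundancy that a pre-sorting scheme avoids.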