A comprehensively sized decision tree generation method for interactive data mining of very large databases

Authors:
Hyontai Sug
Affiliations:
Division of Internet Engineering, Dongseo University, Busan, South Korea
Venue:
ADMA'05: Proceedings of the First International Conference on Advanced Data Mining and Applications
Year:
2005

Abstract

Generating decision trees for very large databases that contain many continuous data values takes a very long time, and the size of a decision tree tends to depend on the size of its training data. For interactive data mining of such databases, a method is therefore proposed that works with relatively small training data extracted from the target database by sampling. The method uses confidence-based samples of a proper size as the training data, both to generate comprehensible trees and to save time. For medium or small databases, the original data may be used directly with some harsh pruning, because the pruning generates trees of similar size with smaller error rates.
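
The abstract does not say how the sample size is chosen, so the following is only a minimal sketch of the general idea of drawing a small training sample from a large table. It assumes Cochran's sample-size formula with a finite population correction, which may differ from the paper's actual procedure. The function names `sample_size` and `draw_training_sample` and the parameters `confidence`, `margin`, and `p` are hypothetical and do not come from the paper.

```python
import math
import random
from statistics import NormalDist


def sample_size(population, confidence=0.95, margin=0.01, p=0.5):
    """Assumed rule: Cochran's formula with finite population correction.

    population: number of rows in the target database
    confidence: desired confidence level, e.g. 0.95
    margin:     tolerated error on an estimated proportion, e.g. 0.01
    p:          assumed proportion; 0.5 is the most conservative choice
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size for an infinite population.
    n0 = z * z * p * (1 - p) / (margin * margin)
    # Shrink the estimate when the population itself is finite.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def draw_training_sample(rows, confidence=0.95, margin=0.01, seed=0):
    """Draw a simple random sample (without replacement) of the computed size."""
    n = sample_size(len(rows), confidence, margin)
    return random.Random(seed).sample(rows, n)


if __name__ == "__main__":
    # The required sample grows very slowly with the database size.
    for total in (10_000, 1_000_000, 100_000_000):
        print(total, sample_size(total))
```

Under these assumptions the sample size levels off as the database grows, which is consistent with the abstract's motivation: a tree built from a bounded sample stays small and is fast to generate, whereas a small database is better used in full.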