Decision trees are commonly used in supervised classification. Supervised classification problems with large training sets are now very common, yet many supervised classifiers cannot handle this amount of data. Some decision tree induction algorithms can process large training sets, but almost all of them have memory restrictions because they must keep the whole training set, or a large portion of it, in main memory. Algorithms that avoid this restriction either have to select a subset of the training set, which requires extra time, or require the user to specify parameter values that can be very difficult to determine. In this paper, we present a new fast heuristic for building decision trees from large training sets that overcomes some of the restrictions of state-of-the-art algorithms by using all the instances of the training set without storing all of them in main memory. Experimental results show that our algorithm is faster than the most recent algorithms for building decision trees from large training sets.
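To make the core idea concrete, the following is a minimal illustrative sketch (not the authors' algorithm) of how a tree node can consume every training instance while keeping only sufficient statistics in memory: each instance updates per-(attribute, value, class) counts and is then discarded, so memory grows with the number of distinct values and classes rather than with the number of instances. The class name `NodeStatistics` and the Gini-based split criterion are assumptions chosen for illustration.

```python
from collections import defaultdict


class NodeStatistics:
    """Sufficient statistics for one tree node.

    Stores per-(attribute, value, class) counts instead of the instances
    themselves, so memory is bounded by the number of distinct attribute
    values and classes, not by the training-set size.
    """

    def __init__(self, n_attributes):
        self.n_attributes = n_attributes
        # counts[attr][(value, label)] -> number of instances seen
        self.counts = [defaultdict(int) for _ in range(n_attributes)]
        self.class_counts = defaultdict(int)
        self.n_seen = 0

    def update(self, instance, label):
        """Fold one instance into the counts; the instance is then discarded."""
        self.n_seen += 1
        self.class_counts[label] += 1
        for attr, value in enumerate(instance):
            self.counts[attr][(value, label)] += 1

    @staticmethod
    def _gini(hist, total):
        return 1.0 - sum((c / total) ** 2 for c in hist.values())

    def best_split(self):
        """Return (attribute index, gain) for the split that most reduces
        a weighted Gini impurity, computed from the counts alone."""
        parent_gini = self._gini(self.class_counts, self.n_seen)
        best_attr, best_gain = None, 0.0
        for attr in range(self.n_attributes):
            # Regroup the flat (value, label) counts into per-value histograms.
            by_value = defaultdict(lambda: defaultdict(int))
            for (value, label), c in self.counts[attr].items():
                by_value[value][label] += c
            weighted = sum(
                (sum(h.values()) / self.n_seen) * self._gini(h, sum(h.values()))
                for h in by_value.values()
            )
            gain = parent_gini - weighted
            if gain > best_gain:
                best_attr, best_gain = attr, gain
        return best_attr, best_gain


# Stream the training set through the node one instance at a time.
stats = NodeStatistics(n_attributes=2)
for instance, label in [(("a", "x"), 0), (("a", "x"), 0),
                        (("b", "x"), 1), (("b", "x"), 1)]:
    stats.update(instance, label)
attr, gain = stats.best_split()  # attribute 0 separates the classes perfectly
```

A real induction algorithm would apply this recursively: once a split is chosen, child nodes get fresh statistics and the data is streamed through again (or routed on the fly), still never materializing the full training set in memory.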