Inductive learning searches for an optimal hypothesis that minimizes a given loss function. It is usually assumed that the simplest hypothesis that fits the data is the best approximation to an optimal hypothesis. Since finding the simplest hypothesis is NP-hard for most representations, various heuristics are generally employed to search for its closest match. Computing these heuristics incurs significant cost, making learning inefficient and unscalable for large datasets. At the same time, it remains questionable whether the simplest hypothesis is indeed the closest approximation to the optimal model. The recent success of combining multiple models, such as bagging, boosting, and meta-learning, has greatly improved on the accuracy of the simplest hypothesis, providing a strong argument against the optimality of the simplest hypothesis. However, computing these combined hypotheses incurs significantly higher cost. In this paper, we first observe that as long as the error of a hypothesis on each example is within a range dictated by a given loss function, the hypothesis can still be optimal. Contrary to common belief, we propose a completely random decision tree algorithm that achieves much higher accuracy than the single best hypothesis and is comparable to boosted or bagged multiple best hypotheses. The advantage of multiple random trees is their training efficiency as well as their minimal memory requirement.
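The idea of a completely random decision tree ensemble can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: each tree picks both the split feature and the split threshold at random (no information gain or Gini computation), leaves store class frequencies, and prediction averages the leaf posteriors across trees. All function and parameter names below are illustrative.

```python
import random

class Node:
    __slots__ = ("feature", "threshold", "left", "right", "dist")

def _leaf(y, classes):
    node = Node()
    node.feature = None
    node.dist = [y.count(c) / len(y) for c in classes]  # class posterior at this leaf
    return node

def _build(X, y, classes, depth, max_depth, rng):
    # Completely random construction: no split criterion is evaluated;
    # both the feature and the threshold are drawn at random. This is
    # what makes training cheap relative to heuristic tree induction.
    if depth == max_depth or len(set(y)) == 1:
        return _leaf(y, classes)
    f = rng.randrange(len(X[0]))                       # random feature
    lo = min(x[f] for x in X)
    hi = max(x[f] for x in X)
    if lo == hi:                                       # feature is constant here
        return _leaf(y, classes)
    t = rng.uniform(lo, hi)                            # random threshold
    li = [i for i in range(len(X)) if X[i][f] <= t]
    ri = [i for i in range(len(X)) if X[i][f] > t]
    if not li or not ri:                               # degenerate split
        return _leaf(y, classes)
    node = Node()
    node.feature, node.threshold = f, t
    node.left = _build([X[i] for i in li], [y[i] for i in li],
                       classes, depth + 1, max_depth, rng)
    node.right = _build([X[i] for i in ri], [y[i] for i in ri],
                        classes, depth + 1, max_depth, rng)
    return node

def _posterior(node, x):
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.dist

def fit_forest(X, y, n_trees=50, max_depth=5, seed=0):
    rng = random.Random(seed)
    classes = sorted(set(y))
    trees = [_build(X, y, classes, 0, max_depth, rng) for _ in range(n_trees)]
    return classes, trees

def predict(model, x):
    classes, trees = model
    # Average the leaf posteriors over all random trees, then take the argmax.
    avg = [sum(_posterior(t, x)[k] for t in trees) / len(trees)
           for k in range(len(classes))]
    return classes[avg.index(max(avg))]
```

Because no split quality is ever measured, training a tree is a single pass of random partitioning, which illustrates the efficiency and low memory footprint the abstract claims; accuracy comes from averaging many such trees rather than from any single tree being good.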