An increasing number of independently proposed randomization methods inject randomness into different stages of decision tree construction in order to build multiple trees. Randomized decision tree methods have been reported to be significantly more accurate than widely accepted single decision trees, even though the training procedure of some methods is almost entirely random and therefore runs counter to the conventional practice of employing gain functions to choose the optimal feature at each node and compute a single tree that best fits the data. One important question that remains poorly understood is the reason behind this high accuracy. We provide an insight based on posterior probability estimation. We first establish the relationship between effective posterior probability estimation and effective loss reduction. We then argue that randomized decision tree methods effectively approximate the true probability distribution within the decision tree hypothesis space. We support this argument with experiments on both synthetic and real-world datasets, under both 0-1 and cost-sensitive loss functions.
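The contrast described above can be illustrated with a small sketch. This is not the paper's exact method: it stands in for complete-random tree construction using scikit-learn's `ExtraTreesClassifier` with `max_features=1` (each split considers a single randomly chosen feature), and compares the log loss of its averaged class-frequency estimates against a single greedy tree, whose hard 0/1 leaf probabilities are heavily penalized under log loss.

```python
# Illustrative sketch (assumed setup, not the paper's experiments):
# a single greedy tree vs. an ensemble of highly randomized trees whose
# averaged leaf class frequencies serve as posterior probability estimates.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Conventional approach: one tree grown with a gain function.
single = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# Randomized approach: max_features=1 makes each split nearly random;
# predicted probabilities are averaged over all trees in the ensemble.
ensemble = ExtraTreesClassifier(n_estimators=100, max_features=1,
                                random_state=0).fit(X_tr, y_tr)

print("single tree log loss:", log_loss(y_te, single.predict_proba(X_te)))
print("random ensemble log loss:", log_loss(y_te, ensemble.predict_proba(X_te)))
```

On data like this, the averaged probabilities of the randomized ensemble typically achieve a much lower log loss than the single tree, consistent with the posterior-estimation argument above.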