Classifier learning is a key technique for KDD. Approaches to learning classifier committees, including Boosting, Bagging, Sasc, and SascB, have demonstrated great success in increasing the prediction accuracy of decision trees. Boosting and Bagging create different classifiers by modifying the distribution of the training set. Sasc takes a different approach: it generates committees by stochastically manipulating the set of attributes considered at each node during tree induction, while keeping the distribution of the training set unchanged. SascB, a combination of Boosting and Sasc, has shown the ability to further increase, on average, the prediction accuracy of decision trees. However, the performance of SascB and Boosting has been found to be more variable than that of Sasc, even though SascB is the most accurate of the three on average. In this paper, we present a novel method that reduces the variability of SascB and Boosting and further increases their average accuracy. It generates multiple committees by incorporating Bagging into SascB. As well as improving stability and average accuracy, the resulting method is amenable to parallel or distributed processing, whereas Boosting and SascB are not. This is an important characteristic for data mining in large datasets.
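To make the two committee-building ideas concrete, the following is a minimal sketch (not the paper's actual algorithm) of combining bootstrap resampling of the training set (the Bagging idea) with stochastic selection of the attributes each member may use (the Sasc idea). The weak learner here is a one-level decision stump rather than a full decision tree, and all function and parameter names (`train_stump`, `bagged_committee`, `attr_fraction`) are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a sequence."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(data, attrs):
    """Weak learner: a one-level decision tree restricted to `attrs`.

    `data` is a list of (feature_tuple, label) pairs. The stump picks the
    attribute/threshold pair in the allowed subset with the best training
    accuracy.
    """
    best = None  # (accuracy, attr, threshold, left_label, right_label)
    for a in attrs:
        for t in sorted({x[a] for x, _ in data}):
            left = [y for x, y in data if x[a] <= t]
            right = [y for x, y in data if x[a] > t]
            if not left or not right:
                continue  # degenerate split, skip
            ll, rl = majority(left), majority(right)
            acc = (sum(y == ll for y in left) +
                   sum(y == rl for y in right)) / len(data)
            if best is None or acc > best[0]:
                best = (acc, a, t, ll, rl)
    if best is None:
        # All examples identical on the allowed attributes: predict the
        # majority class unconditionally.
        m = majority([y for _, y in data])
        return lambda x: m
    _, a, t, ll, rl = best
    return lambda x: ll if x[a] <= t else rl

def bagged_committee(data, n_members=11, attr_fraction=0.5, seed=0):
    """Bagging plus per-member stochastic attribute selection.

    Each member is trained on a bootstrap resample of `data` (Bagging) and
    restricted to a random subset of the attributes (the Sasc idea). The
    committee predicts by unweighted majority vote. Because members are
    independent, they could be trained in parallel, which is the property
    the abstract highlights.
    """
    rng = random.Random(seed)
    n_attrs = len(data[0][0])
    members = []
    for _ in range(n_members):
        boot = [rng.choice(data) for _ in data]        # bootstrap resample
        k = max(1, int(attr_fraction * n_attrs))
        attrs = rng.sample(range(n_attrs), k)          # stochastic attributes
        members.append(train_stump(boot, attrs))
    return lambda x: majority([m(x) for m in members])
```

A usage example on a toy one-attribute problem: train on points labelled by a threshold, then classify unseen points by majority vote of the committee. Note that Boosting, by contrast, must train members sequentially because each round's distribution depends on the previous members' errors.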