Active learning with misclassification sampling using diverse ensembles enhanced by unlabeled instances

Authors:
Jun Long;Jianping Yin;En Zhu;Wentao Zhao
Affiliations:
National University of Defense Technology, Changsha, Hunan, China;National University of Defense Technology, Changsha, Hunan, China;National University of Defense Technology, Changsha, Hunan, China;National University of Defense Technology, Changsha, Hunan, China
Venue:
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining
Year:
2008

Citing 10
Cited 0

Query by committee

COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
A sequential algorithm for training text classifiers

SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Selective Sampling Using the Query by Committee Algorithm

Machine Learning
Neural Network Ensembles

IEEE Transactions on Pattern Analysis and Machine Intelligence
Toward Optimal Active Learning through Sampling Estimation of Error Reduction

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Query Learning Strategies Using Boosting and Bagging

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Query Learning with Large Margin Classifiers

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Diverse ensembles for active learning

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Active learning with multiple views

Journal of Artificial Intelligence Research
Constructing diverse classifier ensembles using artificial training examples

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Active learners can significantly reduce the number of labeled training instances to learn a classification function by actively selecting only the most informative instances for labeling. Most existing methods try to select the instances which could halve the version space size after each sampling. In contrast to them, we try to reduce the volume of the version space more than half. Therefore, a sampling criterion of misclassification is presented. Furthermore, in each iteration of active learning, a strong classifier was introduced to estimate the target function for evaluation of the misclassification degree of an instance. We use a modified popular ensemble learning method DECORATE as the strong classifier which was enhanced by the unlabeled instances with high certainty by the current base classifier. The experiments show that the proposed method outperforms the traditional sampling methods on most selected datasets.