By selecting and asking the user to label only the most informative instances, active learners can significantly reduce the number of labeled training instances needed to learn a classification function. We focus here on how to select the most informative instances for labeling. This paper makes three contributions. First, in contrast to the leading sampling strategy of halving the volume of the version space, we present a sampling strategy that reduces the volume of the version space by more than half, under the assumption that the target function is drawn from a non-uniform distribution over the version space. Second, via the Halving model, we propose the idea of sampling the instances that are most likely to be misclassified. Third, we present a sampling method named CBMPMS (Committee Based Most Possible Misclassification Sampling), which samples the instances that have the largest probability of being misclassified by the current classifier. Compared with existing active learning methods, the proposed CBMPMS method requires fewer sampling iterations for the classifier to reach the same accuracy. Experiments show that the proposed method outperforms traditional sampling methods on most of the selected datasets.
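The core loop described above — train a committee, score each unlabeled instance by how likely the current classifier is to misclassify it, and query the highest-scoring one — can be sketched as follows. This is a minimal illustration, not the paper's exact CBMPMS estimator: the committee here is a bag of one-dimensional threshold classifiers ("stumps") trained on bootstrap resamples, and the misclassification probability is approximated by committee disagreement (one minus the majority-vote fraction). All names (`fit_stump`, `select_query`) are hypothetical.

```python
import random

def fit_stump(data):
    """Toy 1-D classifier: pick the threshold t minimizing training
    error for the rule 'predict True iff x > t'."""
    best_t, best_err = data[0][0], len(data) + 1
    for t, _ in data:
        err = sum((x > t) != y for x, y in data)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

def select_query(committee, pool):
    """Return the pool instance the committee is most likely to
    misclassify, using disagreement as a rough proxy."""
    def misclass_score(x):
        votes = [x > t for t in committee]
        majority = max(votes.count(True), votes.count(False))
        return 1.0 - majority / len(votes)  # 0 = unanimous, 0.5 = split
    return max(pool, key=misclass_score)

random.seed(0)
# small labeled seed set: (feature, label)
labeled = [(0.1, False), (0.3, False), (0.7, True), (0.9, True)]

# build the committee by bagging: each member sees a bootstrap resample
committee = [fit_stump([random.choice(labeled) for _ in labeled])
             for _ in range(7)]

# the active learner queries the most disagreement-prone unlabeled point
pool = [0.2, 0.5, 0.8]
query = select_query(committee, pool)
```

In a full active-learning run, `query` would be sent to the user for a label, added to `labeled`, and the committee retrained; the abstract's claim is that a sharper misclassification-probability estimate than plain disagreement lets this loop converge in fewer queries.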