In some classification tasks, such as the automatic building and maintenance of text corpora, it is expensive to obtain labeled examples to train a classifier. In such circumstances it is common to have massive corpora in which only a few examples are labeled while the rest are not. Semi-supervised learning techniques try to exploit the intrinsic information in unlabeled examples to improve classification models. However, these techniques assume that the labeled examples cover all the classes to be learned, which may not hold. In the presence of an imbalanced class distribution, obtaining labeled examples from minority classes can be very costly if queries are selected at random. Active learning asks an oracle to label new, carefully selected examples, and it does not assume prior knowledge of all classes. d-Confidence is an active learning approach that is effective in the presence of imbalanced training sets. In this paper we discuss the performance of d-Confidence on text corpora. We show empirically that, compared to confidence, a common active learning criterion, d-Confidence reduces the number of queries required to identify examples from all the classes to be learned.
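To make the baseline concrete, the following is a minimal sketch of the plain confidence criterion that d-Confidence is compared against: the learner queries the unlabeled example whose most probable class has the lowest posterior probability. This is a generic illustration, not the paper's d-Confidence method (which, per the abstract, additionally accounts for the distance of candidates to already-labeled classes); the array values are made up for the example.

```python
import numpy as np

def least_confidence_query(probs):
    """Return the index of the unlabeled example to query next
    under the plain confidence criterion: the example whose
    top-class posterior probability is lowest."""
    top_class_confidence = probs.max(axis=1)  # confidence per example
    return int(np.argmin(top_class_confidence))

# Toy posterior probabilities for 4 unlabeled examples, 2 classes
# (hypothetical values for illustration only).
probs = np.array([[0.90, 0.10],
                  [0.55, 0.45],
                  [0.70, 0.30],
                  [0.60, 0.40]])

print(least_confidence_query(probs))  # example 1 is least confident
```

In an imbalanced setting this criterion tends to keep querying near the boundary of the already-known classes, which is why, as the abstract argues, a distance-aware criterion such as d-Confidence can reach examples of the still-unseen minority classes with fewer queries.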