Empirical risk minimization (ERM) provides a useful guideline for many machine learning and data mining algorithms. Under the ERM principle, one minimizes an upper bound on the true risk, which is approximated by the sum of the empirical risk and the complexity of the candidate classifier class. To guarantee satisfactory learning performance, ERM requires that the training data be sampled i.i.d. from the unknown source distribution. However, this may not hold in active learning, where one selects the most informative samples to label, and the selected data may not follow the source distribution. In this paper, we generalize the empirical risk minimization principle to the active learning setting. We derive a novel upper bound on the true risk in the active learning setting; by minimizing this upper bound, we develop a practical batch mode active learning method. The proposed formulation involves a non-convex integer programming problem, which we solve efficiently by an alternating optimization method. Our method is shown to query the most informative samples while preserving the source distribution as much as possible, thus identifying the most uncertain and representative queries. Experiments on benchmark data sets and real-world applications demonstrate the superior performance of the proposed method compared with state-of-the-art methods.
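The abstract does not state the bound itself, so the following is only a schematic sketch of the kind of inequality it describes. The first line is the generic ERM-style bound for an i.i.d. sample. The second line is an assumed, illustrative shape for the active learning case, not the paper's actual result. The symbols $\mathcal{F}$, $S$, $Q$, $P$, $C$, and $D$ are notation introduced here for illustration.

```latex
% Standard i.i.d. setting: S is a sample of n points drawn i.i.d. from the
% source distribution P, \hat{R}_S is the empirical risk on S, and C is a
% complexity term for the classifier class \mathcal{F}.
R(f) \;\le\; \hat{R}_S(f) + C(\mathcal{F}, n, \delta)
\quad \text{for all } f \in \mathcal{F},
\text{ with probability at least } 1 - \delta.

% Assumed illustrative shape for active learning (NOT taken from the source):
% Q is the actively queried labeled set, which need not follow P, and
% D(Q, P) is a hypothetical term measuring the mismatch between Q and P.
R(f) \;\le\; \hat{R}_Q(f) + C(\mathcal{F}, |Q|, \delta) + D(Q, P).
```

Under this assumed shape, minimizing the right-hand side jointly over the classifier $f$ and the query set $Q$ would favor queries that are both informative for $f$ and representative of $P$. Since choosing $Q$ is a discrete selection, alternating between updating $f$ with $Q$ fixed and updating $Q$ with $f$ fixed is one natural way to approach such a problem.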