COLT '92 Proceedings of the fifth annual workshop on Computational learning theory
Active Learning Using Arbitrary Binary Valued Queries. Machine Learning.
Improving Generalization with Active Learning. Machine Learning - Special issue on structured connectionist systems.
The nature of statistical learning theory.
Selective Sampling Using the Query by Committee Algorithm. Machine Learning.
Active learning using adaptive resampling. Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining.
A Theory of Learning and Generalization: With Applications to Neural Networks and Control Systems.
Machine Learning
Toward Optimal Active Learning through Sampling Estimation of Error Reduction. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning.
Less is More: Active Learning with Support Vector Machines. ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning.
Selective Sampling for Nearest Neighbor Classifiers. Machine Learning.
Batch mode active learning and its application to medical image classification. ICML '06 Proceedings of the 23rd international conference on Machine learning.
Semisupervised SVM batch mode active learning with applications to image retrieval. ACM Transactions on Information Systems (TOIS).
Active learning with statistical models. Journal of Artificial Intelligence Research.
COLT'07 Proceedings of the 20th annual conference on Learning theory
Analysis of perceptron-based active learning. COLT'05 Proceedings of the 18th annual conference on Learning Theory.
Active learning [1] is a branch of machine learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their labels, may choose which instances to query from a pool of unlabeled data. It is suited to settings where labeling instances is costly. This paper analyzes the speed-up of batch (parallel) active learning over sequential active learning (where instances are chosen one by one): how much faster can an algorithm learn if it can query λ instances at once? There are two main contributions: proving lower and upper bounds on the achievable gain, and illustrating them with experiments on standard active learning algorithms. Roughly speaking, the speed-up is asymptotically logarithmic in the batch size λ (i.e., as λ → ∞). However, for some classes of functions with finite VC-dimension V, a linear speed-up can be achieved up to a batch size of V. In practical terms, this means that parallelizing queries on an expensive-to-label problem suited to active learning is highly beneficial up to V simultaneous queries, and less interesting (though still an improvement) beyond that.
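The logarithmic speed-up can be made concrete on a toy problem not taken from the paper: actively learning a threshold function on [0, 1] (a class of VC-dimension 1). Sequentially, each label query bisects the interval containing the threshold; with a batch of λ queries placed at evenly spaced points, one round of parallel queries splits the interval into λ + 1 parts, so the number of rounds shrinks only by a factor of about log₂(λ + 1) — logarithmic in the batch size. The sketch below counts rounds for both regimes; it is an illustration of the asymptotic behavior, not the paper's algorithm.

```python
import math

def rounds_needed(eps, batch_size):
    """Rounds of label queries needed to localize an unknown threshold
    in [0, 1] down to an interval of width eps, when each round may ask
    `batch_size` queries in parallel at evenly spaced points.
    Each round shrinks the candidate interval by a factor (batch_size + 1)."""
    width, rounds = 1.0, 0
    while width > eps:
        width /= batch_size + 1
        rounds += 1
    return rounds

eps = 1e-6
sequential = rounds_needed(eps, 1)  # one query per round: plain bisection
for lam in (1, 3, 7, 15):
    batch = rounds_needed(eps, lam)
    # speed-up ~ log2(lam + 1): logarithmic in the batch size lam
    print(f"batch size {lam:2d}: {batch:2d} rounds, "
          f"speed-up {sequential / batch:.2f}")
```

With eps = 1e-6, bisection takes 20 rounds, while a batch of 15 takes 5 rounds: a 4x speed-up (log₂ 16) despite issuing 15 queries per round, matching the logarithmic regime described above once λ exceeds the VC-dimension.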