Improving importance estimation in pool-based batch active learning for approximate linear regression

Authors:
Nozomi Kurihara;Masashi Sugiyama
Venue:
Neural Networks
Year:
2012


Abstract

Pool-based batch active learning aims to choose training inputs from a 'pool' of test inputs so that the generalization error is minimized. P-ALICE (Pool-based Active Learning using Importance-weighted least-squares learning based on Conditional Expectation of the generalization error) is a state-of-the-art method that copes with model misspecification by weighting training samples according to their importance (i.e., the ratio of the test input density to the training input density). However, importance estimation in the original P-ALICE assumes that the number of training samples to gather is small, which is not always true in practice. In this paper, we propose an alternative scheme for importance estimation based on the inclusion probability, and show its validity through numerical experiments.
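The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made here for illustration: training inputs are drawn from the pool without replacement according to a resampling distribution `b`, the inclusion probability of each pool point is estimated by plain Monte Carlo simulation, and each chosen sample is weighted by the inverse of its estimated inclusion probability. The function names and the `label_fn` oracle are hypothetical.

```python
import numpy as np


def estimate_inclusion_probabilities(b, n, n_trials=2000, seed=0):
    """Monte Carlo estimate of the probability that each pool point is
    among n points drawn without replacement with probabilities b.
    (Assumption: the abstract does not say how the probability is computed.)"""
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    b = b / b.sum()  # make sure b is a proper distribution
    counts = np.zeros(len(b))
    for _ in range(n_trials):
        chosen = rng.choice(len(b), size=n, replace=False, p=b)
        counts[chosen] += 1
    return counts / n_trials


def importance_weighted_least_squares(X, y, weights):
    """Minimise sum_i weights[i] * (y[i] - X[i] @ theta)**2."""
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    # Scaling rows by sqrt(w) turns weighted LS into ordinary LS.
    theta, *_ = np.linalg.lstsq(sqrt_w[:, None] * X, sqrt_w * y, rcond=None)
    return theta


def fit_from_pool(X_pool, label_fn, b, n, n_trials=2000, seed=0):
    """Choose n training inputs from the pool, label them, and fit a
    linear model with inverse-inclusion-probability weights (hypothetical)."""
    rng = np.random.default_rng(seed)
    b = np.asarray(b, dtype=float)
    b = b / b.sum()
    pi = estimate_inclusion_probabilities(b, n, n_trials=n_trials, seed=seed + 1)
    chosen = rng.choice(len(b), size=n, replace=False, p=b)
    # Guard against a zero estimate for a point the simulation never drew.
    weights = 1.0 / np.maximum(pi[chosen], 1.0 / n_trials)
    X_train = X_pool[chosen]
    y_train = label_fn(X_train)
    return importance_weighted_least_squares(X_train, y_train, weights), chosen
```

When `n` is small relative to the pool, the estimated inclusion probability of a point is close to `n * b[j]`, so the weights are roughly proportional to `1 / b[j]`. As `n` grows the two diverge, which is the regime the abstract says the small-sample assumption fails to cover.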