Asking generalized queries with minimum cost

Authors:
Jun Du;Charles X. Ling
Affiliations:
Department of Computer Science, The University of Western Ontario, London, Ontario, Canada;Department of Computer Science, The University of Western Ontario, London, Ontario, Canada
Venue:
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
Year:
2011

Abstract

Previous work on active learning usually asks only specific queries. A more natural approach is to ask generalized queries with don't-care features. Because each such generalized query can often represent a set of specific ones, its answer is usually more helpful in speeding up the learning process. However, despite these advantages, the oracle usually needs more expertise (or effort) to answer generalized queries accurately in real-world situations. In this paper, we therefore make the more realistic assumption that the more general a query is, the higher its querying cost. This yields a trade-off: asking generalized queries can speed up learning but usually at high cost, whereas asking specific queries is much cheaper but may slow the learning process down. To resolve this issue, we propose two novel active learning algorithms for two scenarios: one balances predictive accuracy against querying cost, and the other minimizes the total cost of misclassification and querying. We demonstrate that our new methods can significantly outperform existing active learning algorithms in both scenarios.
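The trade-off described in the abstract can be made concrete with a small sketch. The abstract gives neither a cost function nor the algorithms themselves, so everything below is an illustrative assumption rather than the authors' method: the `DONT_CARE` marker, the toy feature domains, the linear cost model in `query_cost`, and the constants `base_cost`, `per_dont_care`, and `error_cost` are all hypothetical.

```python
from itertools import product

DONT_CARE = "*"  # hypothetical marker for a don't-care feature


def expand(query, domains):
    """List the specific queries covered by a (possibly generalized) query."""
    choices = [
        domains[i] if value == DONT_CARE else [value]
        for i, value in enumerate(query)
    ]
    return list(product(*choices))


def query_cost(query, base_cost=1.0, per_dont_care=0.5):
    """Assumed cost model: cost grows linearly with the number of don't-cares."""
    n_dont_care = sum(value == DONT_CARE for value in query)
    return base_cost + per_dont_care * n_dont_care


def total_cost(queries, n_errors, error_cost=2.0):
    """Assumed objective: querying cost plus misclassification cost."""
    return sum(query_cost(q) for q in queries) + error_cost * n_errors


# Toy feature domains (assumption): outlook, temperature, windy.
domains = [["sunny", "rainy"], ["hot", "mild", "cool"], ["yes", "no"]]

specific = ("sunny", "hot", "yes")
general = ("sunny", DONT_CARE, DONT_CARE)

print(len(expand(specific, domains)), query_cost(specific))
print(len(expand(general, domains)), query_cost(general))
print(total_cost([specific, general], n_errors=3))
```

Under these assumed numbers, the generalized query covers six specific queries but costs twice as much as a single specific one, which is the tension the two proposed scenarios are meant to address.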