Do they belong to the same class: active learning by querying pairwise label homogeneity

Authors:
Yifan Fu;Bin Li;Xingquan Zhu;Chengqi Zhang
Affiliations:
Centre for Quantum Computation & Intelligent System, UTS, Sydney, Australia;Centre for Quantum Computation & Intelligent System, UTS, Sydney, Australia;Centre for Quantum Computation & Intelligent System, UTS, Sydney, Australia;Centre for Quantum Computation & Intelligent System, UTS, Sydney, Australia
Venue:
Proceedings of the 20th ACM international conference on Information and knowledge management
Year:
2011

Citing 7
Cited 0

Semi-supervised learning using randomized mincuts

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Learning from labeled features using generalized expectation criteria

Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Get another label? improving data quality and data mining using multiple, noisy labelers

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Supervised learning from multiple experts: whom to trust when everyone lies a bit

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
An analysis of active learning strategies for sequence labeling tasks

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Active learning with statistical models

Journal of Artificial Intelligence Research
Asking Generalized Queries to Domain Experts to Improve Learning

IEEE Transactions on Knowledge and Data Engineering

Quantified Score

Hi-index	0.00

Visualization

Abstract

Traditional active learning methods request experts to provide ground truths to the queried instances, which can be expensive in practice. An alternative solution is to ask nonexpert labelers to do such labeling work, which can not tell the definite class labels. In this paper, we propose a new active learning paradigm, in which a nonexpert labeler is only asked "whether a pair of instances belong to the same class". To instantiate the proposed paradigm, we adopt the MinCut algorithm as the base classifier. We first construct a graph based on the pairwise distance of all the labeled and unlabeled instances and then repeatedly update the unlabeled edge weights on the max-flow paths in the graph. Finally, we select an unlabeled subset of nodes with the highest prediction confidence as the labeled data, which are included into the labeled data set to learn a new classifier for the next round of active learning. The experimental results and comparisons, with state-of-the-art methods, demonstrate that our active learning paradigm can result in good performance with nonexpert labelers.