An efficient active learning method based on random sampling and backward deletion

Authors:
Hoyoung Woo; Cheong Hee Park
Affiliations:
Dept. of Computer Science and Engineering, Chungnam National University, Yuseong-gu, Korea; Dept. of Computer Science and Engineering, Chungnam National University, Yuseong-gu, Korea
Venue:
IScIDE'12: Proceedings of the Third Sino-foreign-interchange Conference on Intelligent Science and Intelligent Data Engineering
Year:
2012

Abstract

Active learning aims to select the data samples that would be most informative for improving classification performance, so that their class labels can be obtained from an expert. Recently, an active learning method based on locally linear reconstruction (LLR) was proposed, and experiments comparing it with other active learning methods demonstrated its strong performance. However, the time complexity of LLR is very high because data selection requires repeated matrix operations. In this paper, we propose an efficient active learning method based on random sampling and backward deletion. We first select a small subset of data samples by random sampling from the full data set. We then iteratively delete the most redundant points in the subset by searching for the pair of data samples with the smallest distance. A graph-based shortest-path distance is used as the distance measure in order to account for the underlying data distribution. Experimental results demonstrate that the proposed method has very low time complexity, while the data samples it selects have greater prediction power than those selected by LLR.
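
The selection procedure summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the graph is built, which point of the closest pair is deleted, or whether distances are recomputed after a deletion. The sketch assumes a symmetric k-nearest-neighbour graph over the random subset, graph distances computed once up front, and a tie-break that drops the point of the pair lying closer to the remaining points. The names `knn_graph_distances`, `select_samples`, `k`, and `seed` are hypothetical.

```python
import math
import random


def knn_graph_distances(points, k):
    """All-pairs shortest-path distances over a symmetric k-NN graph (assumed graph type)."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Connect i to its k nearest neighbours, with edges in both directions.
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            dist[i][j] = dist[j][i] = euclid[i][j]
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


def select_samples(data, n_select, subset_size, k=5, seed=0):
    """Return indices into `data` chosen by random sampling, then backward deletion.

    Assumes 1 <= n_select <= subset_size <= len(data).
    """
    rng = random.Random(seed)
    # Step 1: draw a small random subset of the full data set.
    candidates = rng.sample(range(len(data)), subset_size)
    # Graph distances are computed once on the subset (an assumption of this sketch).
    dist = knn_graph_distances([data[i] for i in candidates], k)
    alive = list(range(subset_size))

    def nearest_other(p, partner):
        # Distance from p to its closest surviving point other than its partner.
        return min((dist[p][q] for q in alive if q not in (p, partner)),
                   default=math.inf)

    # Step 2: repeatedly find the closest surviving pair and delete one of its points.
    while len(alive) > n_select:
        a, b = min(
            ((i, j) for pos, i in enumerate(alive) for j in alive[pos + 1:]),
            key=lambda pair: dist[pair[0]][pair[1]],
        )
        # Assumed tie-break: drop the point that is also closer to the other survivors.
        drop = a if nearest_other(a, b) <= nearest_other(b, a) else b
        alive.remove(drop)
    return [candidates[i] for i in alive]
```

For example, on the five points `(0, 0)`, `(0.01, 0)`, `(5, 0)`, `(10, 0)`, `(10.01, 0)` with `n_select=3`, `subset_size=5`, and `k=2`, the sketch keeps the isolated middle point and one point from each near-duplicate pair. The returned indices would then be sent to an expert for labeling. The pure-Python Floyd-Warshall step is cubic in the subset size, so it is only practical because the random subset is small.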