Selective sampling is an active variant of online learning in which the learner may adaptively query the label of an observed example. Its goal is to achieve a good trade-off between prediction performance and the number of queried labels. Existing selective sampling algorithms are designed for vector-based data. In this paper, motivated by the ubiquity of graph representations in real-world applications, we study selective sampling on graphs. We first present an online version of the well-known Learning with Local and Global Consistency method (OLLGC). It is essentially a second-order online learning algorithm, and can be seen as online ridge regression in the Hilbert space of functions defined on graphs. We prove its regret bound in terms of a structural property (the cut size) of the graph. Based on OLLGC, we present a selective sampling algorithm, Selective Sampling with Local and Global Consistency (SSLGC), which queries the label of each node based on the confidence of the linear function on the graph. We also derive its bound on the label complexity. Finally, we analyze the low-rank approximation of graph kernels, which enables the online algorithms to scale to large graphs. Experiments on benchmark graph datasets show that OLLGC significantly outperforms the state-of-the-art first-order algorithm, and that SSLGC achieves comparable or even better accuracy than OLLGC while querying substantially fewer nodes. Moreover, SSLGC is overwhelmingly better than random sampling.
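To make the abstract's ingredients concrete, the following is a minimal sketch (not the paper's exact OLLGC/SSLGC algorithms) of the three pieces it describes: a graph kernel derived from the Laplacian, a second-order online learner that performs ridge regression in that kernel's feature space and queries a node's label only when its prediction confidence is low, and a low-rank approximation of the kernel for scalability. The kernel choice `(L + reg*I)^{-1}`, the fixed confidence threshold, and all function names are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def graph_kernel(A, reg=1.0):
    """Kernel K = (L + reg*I)^{-1} built from the graph Laplacian L.
    Illustrative choice; the paper's OLLGC is based on the Learning
    with Local and Global Consistency formulation instead."""
    L = np.diag(A.sum(axis=1)) - A          # unnormalized Laplacian
    return np.linalg.inv(L + reg * np.eye(len(A)))

def low_rank_kernel(A, rank, reg=1.0):
    """Rank-k approximation of the same kernel via the Laplacian's
    smallest eigenpairs, so the method can scale to large graphs."""
    L = np.diag(A.sum(axis=1)) - A
    vals, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    vals, vecs = vals[:rank], vecs[:, :rank]
    return vecs @ np.diag(1.0 / (vals + reg)) @ vecs.T

def selective_sampling(K, y, threshold=0.2, reg=1.0):
    """Second-order online learner with confidence-based queries.

    For each node t we predict with online ridge regression over the
    kernel rows seen so far, and query the true label y[t] only when
    the prediction magnitude (our confidence proxy) is below
    `threshold`.  Returns (node, prediction) pairs and queried nodes.
    """
    n = len(y)
    A_inv = np.eye(n) / reg                 # inverse regularized Gram matrix
    b = np.zeros(n)                         # accumulated label-weighted features
    preds, queried = [], []
    for t in range(n):
        x = K[t]                            # feature vector of node t
        margin = x @ A_inv @ b              # ridge-regression prediction
        preds.append((t, np.sign(margin) if margin else 1.0))
        if abs(margin) < threshold:         # low confidence -> query the label
            queried.append(t)
            Ax = A_inv @ x                  # Sherman-Morrison rank-one update
            A_inv -= np.outer(Ax, Ax) / (1.0 + x @ Ax)
            b += y[t] * x
    return preds, queried
```

With full rank the eigendecomposition-based kernel coincides with the exact inverse, which is why truncating to the smallest `rank` eigenpairs is a natural approximation: those directions dominate `(L + reg*I)^{-1}`.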