Combining link and content for collective active learning

Authors:
Lixin Shi; Yuhang Zhao; Jie Tang
Affiliations:
Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;Tsinghua University, Beijing, China
Venue:
CIKM '10: Proceedings of the 19th ACM International Conference on Information and Knowledge Management
Year:
2010

Abstract

In this paper, we study a novel problem, Collective Active Learning, in which we aim to select a batch of "informative" instances from a networked data set to query the user, in order to improve the accuracy of the learned classification model. We perform a theoretical investigation of the problem and present three criteria (minimum redundancy, maximum uncertainty, and maximum impact) to quantify the informativeness of a set of selected instances. We define an objective function based on these three criteria and present an efficient algorithm that optimizes it with a bounded approximation rate. Experimental results on real-world data demonstrate the effectiveness of our proposed approach.
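
The abstract names the three criteria but does not give the objective function or the algorithm, so the following is only an illustrative sketch of how such a batch selection might look, not the authors' method. It assumes entropy of the predicted class distribution as the uncertainty term, the summed entropy of not-yet-covered neighbors as the impact term, and the number of links to already selected nodes as the redundancy term. The function names, the weights `alpha` and `beta`, and the greedy selection loop are all hypothetical, and no approximation guarantee is claimed for this sketch.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def greedy_batch(probs, neighbors, k, alpha=1.0, beta=1.0):
    """Greedily pick up to k nodes to query (illustrative heuristic only).

    probs:     dict mapping node -> list of predicted class probabilities
    neighbors: dict mapping node -> set of linked nodes
    alpha:     assumed weight on the impact term
    beta:      assumed weight on the redundancy penalty
    """
    selected = []
    covered = set()          # selected nodes and their neighbors
    candidates = set(probs)

    def gain(v):
        nbrs = neighbors.get(v, set())
        # Uncertainty: how unsure the current model is about v.
        uncertainty = entropy(probs[v])
        # Impact: uncertainty of neighbors that no selected node reaches yet.
        fresh = nbrs - covered - {v}
        impact = sum(entropy(probs[u]) for u in fresh if u in probs)
        # Redundancy: direct links to nodes already in the batch.
        redundancy = len(nbrs & set(selected))
        return uncertainty + alpha * impact - beta * redundancy

    for _ in range(min(k, len(candidates))):
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= {best} | neighbors.get(best, set())
        candidates.remove(best)
    return selected


# Toy network: "a" links to "b" and "c"; "d" is isolated.
toy_probs = {"a": [0.5, 0.5], "b": [0.5, 0.5], "c": [0.9, 0.1], "d": [0.5, 0.5]}
toy_links = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
batch = greedy_batch(toy_probs, toy_links, k=2)
print(batch)  # ['a', 'd']
```

In the toy network, "a" is picked first because it is uncertain and reaches two unlabeled neighbors. "d" is picked second because "b" and "c" are already covered by "a" and are penalized as redundant.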