A few good predictions: selective node labeling in a social network

Authors:
Gaurish Chaudhari;Vashist Avadhanula;Sunita Sarawagi
Affiliations:
Indian Institute of Technology Bombay, Mumbai, India;Indian Institute of Technology Bombay, Mumbai, India;Indian Institute of Technology Bombay, Mumbai, India
Venue:
Proceedings of the 7th ACM international conference on Web search and data mining
Year:
2014

Abstract

Many social network applications face the following problem: given a network G = (V, E) with labels on a small subset O \subset V of nodes and an optional set of features on nodes and edges, predict the labels of the remaining nodes. Much research has gone into designing learning models and inference algorithms for accurate predictions in this setting. However, a core hurdle to any prediction effort is that for many nodes there is insufficient evidence for inferring a label. We propose that instead of focusing on the impossible task of providing high accuracy over all nodes, we should focus on selectively making the few node predictions that will be correct with high probability. Any selective prediction strategy requires that the scores attached to node predictions be well calibrated. Our evaluations show that existing prediction algorithms are poorly calibrated. We propose a new method of training a graphical model using a conditional likelihood objective that provides better calibration than the existing joint likelihood objective. We augment it with a decoupled confidence model created using a novel unbiased training process. Empirical evaluation on two large social networks shows that we are able to select a large number of predictions with accuracy as high as 95%, even when the best overall accuracy is only 40%.
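
The idea of selective prediction and its dependence on calibration can be illustrated with a minimal Python sketch. This is not the authors' model or training procedure: it assumes each node already has a predicted label with a confidence score in [0, 1], and the function names `selective_accuracy` and `expected_calibration_error`, the binning scheme, and the toy data are all hypothetical choices made for illustration. The first function measures accuracy over only the most confident fraction of nodes; the second is one common way (an assumption here, not named in the abstract) to quantify how far confidence scores are from observed accuracy.

```python
import numpy as np


def selective_accuracy(confidence, correct, coverage):
    """Accuracy over the top `coverage` fraction of nodes, ranked by confidence.

    confidence: per-node confidence scores in [0, 1] (assumed given).
    correct:    per-node flags, True where the predicted label is right.
    coverage:   fraction of nodes for which we choose to emit a prediction.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    k = max(1, int(round(coverage * len(confidence))))
    # Keep the k most confident nodes; abstain on the rest.
    top = np.argsort(-confidence, kind="stable")[:k]
    return float(correct[top].mean())


def expected_calibration_error(confidence, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    A value near 0 means the scores can be read as probabilities of being
    correct, which is what a selective strategy relies on.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # Equal-width bins over [0, 1]; a score of exactly 1.0 goes in the last bin.
    bins = np.minimum((confidence * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidence[mask].mean())
            ece += mask.mean() * gap
    return float(ece)


# Toy data (fabricated purely for illustration, not from the paper).
toy_confidence = [0.95, 0.9, 0.85, 0.6, 0.5, 0.4, 0.3, 0.3, 0.2, 0.2]
toy_correct = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

if __name__ == "__main__":
    print("overall accuracy:", selective_accuracy(toy_confidence, toy_correct, 1.0))
    print("top-30% accuracy:", selective_accuracy(toy_confidence, toy_correct, 0.3))
    print("calibration error:", expected_calibration_error(toy_confidence, toy_correct))
```

On the toy data, restricting predictions to the most confident nodes yields higher accuracy than predicting for every node, which is the effect the abstract describes. The ranking is only trustworthy when the scores are well calibrated, which is why the calibration measure is computed alongside it.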