Using ghost edges for classification in sparsely labeled networks

Authors:
Brian Gallagher; Hanghang Tong; Tina Eliassi-Rad; Christos Faloutsos
Affiliations:
Lawrence Livermore National Laboratory, Livermore, CA, USA; Carnegie Mellon University, Pittsburgh, PA, USA; Lawrence Livermore National Laboratory, Livermore, CA, USA; Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2008


Abstract

We address the problem of classification in partially labeled networks (a.k.a. within-network classification) where observed class labels are sparse. Techniques for statistical relational learning have been shown to perform well on network classification tasks by exploiting dependencies between class labels of neighboring nodes. However, relational classifiers can fail when unlabeled nodes have too few labeled neighbors to support learning (during the training phase) and/or inference (during the testing phase). This situation arises in real-world problems when observed labels are sparse. In this paper, we propose a novel approach to within-network classification that combines aspects of statistical relational learning and semi-supervised learning to improve classification performance in sparse networks. Our approach works by adding "ghost edges" to a network, which enable the flow of information from labeled to unlabeled nodes. Through experiments on real-world data sets, we demonstrate that our approach performs well across a range of conditions where existing approaches, such as collective classification and semi-supervised learning, fail. Across all tasks, our approach improves area under the ROC curve (AUC) by up to 15 points over existing approaches. Furthermore, we demonstrate that our approach runs in time proportional to L · E, where L is the number of labeled nodes and E is the number of edges.
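
The ghost-edge idea can be illustrated with a minimal sketch. The abstract does not say how ghost edges are weighted, so this example assumes the weight between a labeled and an unlabeled node is a random-walk-with-restart proximity score, computed by a fixed number of power iterations. The function names, the restart probability, the iteration count, and the weighted-vote classifier are all hypothetical choices made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def rwr_scores(adj, seed, restart=0.15, iters=50):
    """Approximate random-walk-with-restart proximity from `seed` to all nodes.

    Assumption: a fixed number of power iterations. Each iteration touches
    every edge once, so one call costs O(iters * E).
    """
    scores = {node: 0.0 for node in adj}
    scores[seed] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, nbrs in adj.items():
            if not nbrs or scores[node] == 0.0:
                continue
            # Spread the non-restarting mass evenly over the neighbors.
            share = (1.0 - restart) * scores[node] / len(nbrs)
            for nbr in nbrs:
                nxt[nbr] += share
        nxt[seed] += restart  # the walker jumps back to the seed
        scores = nxt
    return scores


def ghost_edge_predict(adj, labels, restart=0.15, iters=50):
    """Label each unlabeled node by a proximity-weighted vote of labeled nodes.

    Every (labeled, unlabeled) pair is treated as a ghost edge whose weight is
    the proximity score. One walk per labeled node gives L walks of O(E) work
    each when `iters` is held constant.
    """
    votes = {node: defaultdict(float) for node in adj if node not in labels}
    for seed, label in labels.items():
        proximity = rwr_scores(adj, seed, restart, iters)
        for node in votes:
            votes[node][label] += proximity[node]  # ghost edge seed -> node
    return {
        node: max(tally, key=tally.get) if tally else None
        for node, tally in votes.items()
    }


# Toy path graph 0-1-2-3-4-5 with only the two endpoints labeled.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = {0: "A", 5: "B"}
predictions = ghost_edge_predict(adj, labels)
print(predictions)
```

In this toy graph, nodes 2 and 3 have no labeled neighbors at all, which is the sparse-label situation the abstract describes. The ghost edges still connect them to both labeled endpoints, and each one takes the label of the closer endpoint.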