On the utility of abstraction in labeling actors in social networks

Authors:
Ngot Bui;Vasant Honavar
Affiliations:
Iowa State University, Ames, Iowa;Iowa State University, Ames, Iowa
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Abstract

Social networks are naturally represented as heterogeneous networks with multiple types of objects (e.g., actors and items) and multiple types of links (e.g., links between actors that denote social ties such as friendship, and links that connect actors to items such as photos, videos, and articles). In this paper, we consider the task of assigning labels to the unlabeled actors (individuals) in a large heterogeneous social network in which labels are available for only a subset of actors. Specifically, we seek to learn a predictive model that labels actors based on the attributes of the actors themselves and/or the items linked to them in the network. Unfortunately, the number of distinct items represented in real-world networks such as Facebook or Flickr is quite large (in the millions), although only a small subset of them is linked to any specific actor. This leads to data sparsity, which causes over-fitting and hence poor performance in predicting the labels of unlabeled actors. To address this problem, we induce hierarchical taxonomies over items and use the resulting taxonomies as a basis for selecting abstract, and hence parsimonious, representations of network data for learning the predictive models. Our experiments using three different predictors (Iterative classification Naïve Bayes, Iterative classification Logistic Regression, and EdgeCluster) on two real-world data sets, Last.fm and Flickr, show that predictive models that take advantage of abstract representations of network data are competitive with, and in some cases outperform, those that do not.
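
The idea of replacing sparse item-level features with features defined over groups of items can be illustrated with a small sketch. The abstract does not say how the taxonomies are induced or how a level of abstraction is selected, so everything below is an assumption made for illustration: the function names (`induce_item_groups`, `abstract_representation`), the use of average-linkage merging under Jaccard distance, and the toy data are all hypothetical and should not be read as the authors' procedure.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of actor ids."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def induce_item_groups(item_actors, n_groups):
    """Greedily merge items bottom-up until `n_groups` clusters remain.

    `item_actors` maps an item id to the set of actors linked to it.
    Each merge plays the role of one internal node in a taxonomy over
    items; stopping at `n_groups` is one possible cut through that tree.
    (Assumed stand-in technique, not taken from the paper.)
    """
    clusters = [[item] for item in sorted(item_actors)]

    def avg_dist(c1, c2):
        # Average pairwise Jaccard distance between two clusters of items.
        total = sum(jaccard_distance(item_actors[i], item_actors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_groups:
        # Pick the closest pair of clusters (first pair wins on ties).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def abstract_representation(actor_items, clusters):
    """For each actor, count the linked items that fall in each cluster."""
    group_of = {item: g for g, members in enumerate(clusters)
                for item in members}
    features = {}
    for actor, items in actor_items.items():
        counts = [0] * len(clusters)
        for item in items:
            counts[group_of[item]] += 1
        features[actor] = counts
    return features


# Toy actor-to-item links, invented for this example.
actor_items = {
    "ann": {"song_a", "song_b"},
    "bob": {"song_a", "song_c"},
    "cat": {"photo_x", "photo_y"},
    "dan": {"photo_x", "photo_z"},
}

# Invert the links so each item is described by the actors linked to it.
item_actors = {}
for actor, items in actor_items.items():
    for item in items:
        item_actors.setdefault(item, set()).add(actor)

clusters = induce_item_groups(item_actors, n_groups=2)
features = abstract_representation(actor_items, clusters)
print(clusters)
print(features)
```

On this toy input the six item-level features collapse to two group-level counts per actor, which is the kind of more compact representation the abstract refers to. The greedy merge recomputes all pairwise distances on every pass, so it is only suitable for tiny examples; a network with millions of items would need a scalable clustering method.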