Community detection in content-sharing social networks

Authors:
Nagarajan Natarajan;Prithviraj Sen;Vineet Chaoji
Affiliations:
UT Austin;IBM Research - Almaden;Amazon, Bangalore
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Abstract

Network structure and content in microblogging sites like Twitter influence each other: user A on Twitter follows user B for the tweets that B posts on the network, and A may then re-tweet the content shared by B to A's own followers. In this paper, we propose a probabilistic model that jointly models link communities and content topics by leveraging both the social graph and the content shared by users. We model a community as a distribution over users, use it as a source for topics of interest, and jointly infer both communities and topics using Gibbs sampling. While modeling communities using the social graph and modeling topics using content have each received a great deal of attention, only a few recent approaches try to model topics in content-sharing platforms using both content and the social graph. Our work differs from existing generative models in that we explicitly model the social graph of users along with the user-generated content, mimicking how the two co-evolve in content-sharing platforms. Recent studies have found Twitter to be more of a content-sharing network than a social network, and it seems hard to detect tightly knit communities from the follower-followee links alone. Still, the question of whether Twitter communities can be extracted using both links and content remains open. In this paper, we answer this question in the affirmative. Our model discovers coherent communities and topics, as evidenced by qualitative results on sub-graphs of Twitter users. Furthermore, we evaluate our model on the task of predicting follower-followee links. We show that joint modeling of links and content significantly improves link prediction performance on a sub-graph of Twitter (consisting of about 0.7 million users and over 27 million tweets), compared to generative models based on only structure or only content, and to path-based methods such as Katz.
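For readers unfamiliar with the Katz baseline named in the abstract, the following is a minimal sketch of the standard Katz index, which scores a candidate link from user i to user j by summing over all paths from i to j, with a path of length l weighted by beta^l. The abstract does not say how the authors configured this baseline, so the function name `katz_scores`, the damping value `beta`, and the toy follower graph are illustrative assumptions, not the paper's setup.

```python
import numpy as np


def katz_scores(adj, beta=0.05):
    """Katz index: sum over l >= 1 of beta^l * A^l, computed in closed form.

    `adj` and `beta` are illustrative names; the paper does not specify them.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # The power series converges only if beta < 1 / spectral radius of A.
    spectral_radius = max(abs(np.linalg.eigvals(adj)))
    if spectral_radius > 0 and beta >= 1.0 / spectral_radius:
        raise ValueError("beta must be below 1 / spectral radius of adj")
    identity = np.eye(n)
    # Closed form of the series: (I - beta * A)^{-1} - I
    return np.linalg.inv(identity - beta * adj) - identity


# Toy directed follower graph (an assumption for illustration):
# adj[i, j] = 1 means user i follows user j.
adj = np.array([
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])
scores = katz_scores(adj, beta=0.5)

# Rank the users that user 0 does not yet follow, highest score first.
candidates = [j for j in range(adj.shape[0]) if j != 0 and adj[0, j] == 0]
ranked = sorted(candidates, key=lambda j: scores[0, j], reverse=True)
```

In this toy graph, user 2 is two hops from user 0 and user 3 is three hops away, so user 2 ranks first as a predicted followee. The dense matrix inverse is only practical for small graphs; a graph of the size mentioned in the abstract would need a truncated or sparse approximation.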