Community detection by popularity based models for authored networked data

Authors:
Tianbao Yang;Prakash Mandayam Comar;Linli Xu
Affiliations:
GE Global Research, San Ramon, CA;Michigan State University, East Lansing, MI;University of Science and Technology of China, Hefei, Anhui, China
Venue:
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
Year:
2013

Citing 17
Cited 0

On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B

Spectral K-way ratio-cut partitioning and clustering
DAC '93 Proceedings of the 30th international Design Automation Conference

Historical development of the Newton-Raphson method
SIAM Review

A multilevel algorithm for partitioning graphs
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing

A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
SIAM Journal on Scientific Computing

Automating the Construction of Internet Portals with Machine Learning
Information Retrieval

Learning to Probabilistically Identify Authoritative Documents
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Normalized Cuts and Image Segmentation
CVPR '97 Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97)

Latent Dirichlet allocation
The Journal of Machine Learning Research

The author-topic model for authors and documents
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence

Joint latent topic models for text and citations
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Binary Matrix Factorization with Applications
ICDM '07 Proceedings of the 2007 Seventh IEEE International Conference on Data Mining

RankClus: integrating clustering with ranking for heterogeneous information network analysis
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology

Topic-link LDA: joint models of topic and author community
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning

Ranking-based clustering of heterogeneous information networks with star network schema
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

Combining link and content for community detection: a discriminative approach
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

Community detection via heterogeneous interaction analysis
Data Mining and Knowledge Discovery

Abstract

Community detection has emerged as an attractive topic due to the increasing need to understand and manage networked data of tremendous magnitude. Networked data usually consists of links between entities and attributes that describe the entities. Various approaches have been proposed for detecting communities using link information, attribute information, or both. In this work, we study community detection for networked data with additional authorship information. By authorship, we mean that each entity in the network is authored by entities of another type (e.g., wiki pages are edited by users, products are purchased by customers), which we refer to as authors. Communities of entities are affected by their authors; e.g., two entities associated with the same author tend to belong to the same community. Leveraging authorship information should therefore help detect communities in networked data more accurately. However, it also brings new challenges. The foremost question is how to model the correlation between communities and authorships. We address this question by proposing probabilistic models based on the popularity link model [1], which has been shown to yield encouraging results for community detection. We employ two methods for modeling authorships: (i) the first generates authorships independently of links, from the community memberships and popularities of authors, by analogy with the popularity link model; (ii) the second models the links between entities based on authorships together with the community memberships and popularities of nodes, analogous to the earlier author-topic model.
Building on the basic models, we explore several extensions: (i) we model the community memberships of authors through those of their authored entities to reduce the number of redundant parameters; and (ii) we model the community memberships of entities and/or authors from their attributes using a discriminative approach. We demonstrate the effectiveness of the proposed models through empirical studies.
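
The abstract gives no formulas, so the following is only a rough sketch of what method (i) could look like, not the authors' stated model. It assumes one common popularity-weighted form: a source picks a community according to its own memberships, then picks a target within that community in proportion to the target's membership times its popularity. The function name `target_probabilities` and the arguments `source_membership`, `target_memberships`, and `target_popularity` are hypothetical names chosen for illustration. Passing entities as targets gives link probabilities; passing authors as targets gives authorship probabilities of the same form.

```python
# Illustrative sketch only: the functional form is an assumption,
# not taken from the abstract.

def target_probabilities(source_membership, target_memberships, target_popularity):
    """Return the probability of each target given one source.

    source_membership     : list of K community memberships of the source (sums to 1)
    target_memberships[j] : list of K community memberships of target j
    target_popularity[j]  : non-negative popularity weight of target j
    """
    n = len(target_memberships)
    probs = [0.0] * n
    for k, source_weight in enumerate(source_membership):
        # Within community k, weight each target by membership times popularity.
        weights = [target_memberships[j][k] * target_popularity[j] for j in range(n)]
        total = sum(weights)
        if total == 0:
            # No target belongs to community k, so it contributes nothing.
            continue
        for j in range(n):
            probs[j] += source_weight * weights[j] / total
    return probs


# Toy data (hypothetical): three entities, two communities.
entity_memberships = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
entity_popularity = [1.0, 2.0, 1.0]

# Link probabilities from entity 0 to every entity.
link_probs = target_probabilities(
    entity_memberships[0], entity_memberships, entity_popularity
)

# Authorship probabilities for entity 0 over two hypothetical authors.
author_memberships = [[0.9, 0.1], [0.2, 0.8]]
author_popularity = [1.0, 3.0]
author_probs = target_probabilities(
    entity_memberships[0], author_memberships, author_popularity
)

print(link_probs)
print(author_probs)
```

Under this assumed form, the same routine serves both links and authorships, which mirrors the abstract's description of modeling authorships "by analogy" with the popularity link model. Method (ii) and the discriminative extension would need additional structure that the abstract does not specify.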