This paper introduces LDA-G, a scalable Bayesian approach to finding latent group structure in large real-world graphs. Existing Bayesian approaches to group discovery, such as the Infinite Relational Model (IRM), have only been applied to small graphs with a couple of hundred nodes. LDA-G (short for Latent Dirichlet Allocation for Graphs) adapts a well-known topic modeling algorithm, Latent Dirichlet Allocation (LDA), to operate on graph data instead of text corpora. Our modifications reflect the differences between real-world graph data and text corpora (e.g., a node's neighbor count vs. a document's word count). In our empirical study, we apply LDA-G to several large graphs (with thousands of nodes) from PubMed, a repository of scientific publications. We compare LDA-G's link-prediction performance with two existing approaches: one Bayesian (IRM) and one non-Bayesian (Cross-association). On average, LDA-G outperforms IRM by 15% and Cross-association by 25% in terms of area under the ROC curve. Furthermore, we demonstrate that LDA-G can discover useful qualitative information.
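The graph-to-corpus analogy suggested by the comparison of a node's neighbor count with a document's word count can be illustrated with a minimal sketch. It assumes each node plays the role of a document and each neighbor the role of a word token; the abstract does not spell out this mapping or LDA-G's actual modifications, so the function name `edges_to_documents`, the undirected treatment of edges, and the toy edge list are all hypothetical.

```python
from collections import Counter, defaultdict


def edges_to_documents(edges):
    """Turn an edge list into one 'bag of neighbors' per node.

    Assumption (not stated in the abstract): a node acts as a document
    and each of its neighbors acts as a word token.
    """
    docs = defaultdict(Counter)
    for source, target in edges:
        docs[source][target] += 1
        docs[target][source] += 1  # assumption: edges are undirected
    return dict(docs)


# Hypothetical toy graph, for illustration only.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "e")]
docs = edges_to_documents(edges)

# A node's neighbor count is the analogue of a document's word count.
doc_lengths = {node: sum(bag.values()) for node, bag in docs.items()}
```

Under this assumed mapping, the resulting bags could be passed to any standard LDA implementation, with the inferred "topics" read as candidate latent groups.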