Unsupervised learning by probabilistic latent semantic analysis
Machine Learning
Clustering and Identifying Temporal Trends in Document Databases
ADL '00 Proceedings of the IEEE Advances in Digital Libraries 2000
The Journal of Machine Learning Research
Discovering evolutionary theme patterns from text: an exploration of temporal text mining
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
ICML '06 Proceedings of the 23rd international conference on Machine learning
Topics over time: a non-Markov continuous-time model of topical trends
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Joint latent topic models for text and citations
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Applying latent dirichlet allocation to group discovery in large graphs
Proceedings of the 2009 ACM symposium on Applied Computing
Scientific paper summarization using citation summary networks
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
The ACL Anthology Network corpus
NLPIR4DL '09 Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries
The web of topics: discovering the topology of topic evolution in a corpus
Proceedings of the 20th international conference on World wide web
Clustering scientific literature using sparse citation graph analysis
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Time-Aware Ranking in Dynamic Citation Networks
ICDMW '11 Proceedings of the 2011 IEEE 11th International Conference on Data Mining Workshops
Hi-index | 0.00 |
Understanding how research themes evolve over time in a research community is useful in many ways (e.g., revealing important milestones and discovering emerging major research trends). In this paper, we propose a novel way of analyzing literature citation to explore the research topics and the theme evolution by modeling article citation relations with a probabilistic generative model. The key idea is to represent a research paper by a ``bag of citations'' and model such a ``citation document'' with a probabilistic topic model. We explore the extension of a particular topic model, i.e., Latent Dirichlet Allocation~(LDA), for citation analysis, and show that such a Citation-LDA can facilitate discovering of individual research topics as well as the theme evolution from multiple related topics, both of which in turn lead to the construction of evolution graphs for characterizing research themes. We test the proposed citation-LDA on two datasets: the ACL Anthology Network(AAN) of natural language research literatures and PubMed Central(PMC) archive of biomedical and life sciences literatures, and demonstrate that Citation-LDA can effectively discover the evolution of research themes, with better formed topics than (conventional) Content-LDA.