Studying the evolution of topics (collections of co-occurring words) in a software project is an emerging technique for automatically shedding light on how the project changes over time: which topics are becoming more actively developed, which are dying down, and which have recently become more error-prone and hence require more testing. Existing techniques for modeling topic evolution in software projects suffer from data duplication, which arises when the repository contains multiple copies of the same document, as is the case in source code histories. To address this issue, we propose the Diff model, which applies a topic model only to the changes made to the documents in each version rather than to the whole documents at each version. A comparative study with a state-of-the-art topic evolution model shows that the Diff model can detect more distinct topics as well as more sensitive and accurate topic evolutions, both of which are useful for analyzing source code histories.
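As a rough illustration of the idea of modeling only the changes, the sketch below builds one "delta document" per version of a single file, containing only the words on lines that were added or removed since the previous version. It is a minimal sketch under stated assumptions, not the authors' implementation: the line-based diff, the tokenizer, and the names `tokenize` and `delta_documents` are all hypothetical choices made for this example. The resulting word counts are what a topic model (for instance LDA) would then be fitted to, in place of the full text of every version.

```python
import difflib
import re
from collections import Counter


def tokenize(line):
    # Hypothetical tokenizer: lowercase alphabetic/underscore runs only.
    return re.findall(r"[a-z_]+", line.lower())


def delta_documents(versions):
    """Return one (added, removed) pair of word counts per version.

    `versions` is a list of strings, the successive contents of one file.
    The first version is diffed against an empty file, so all of its
    words count as added. Assumption: a line-based diff approximates
    "the changes of the document" described in the abstract.
    """
    deltas = []
    previous = []
    for text in versions:
        current = text.splitlines()
        added, removed = Counter(), Counter()
        matcher = difflib.SequenceMatcher(a=previous, b=current, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                for line in previous[i1:i2]:
                    removed.update(tokenize(line))
            if tag in ("replace", "insert"):
                for line in current[j1:j2]:
                    added.update(tokenize(line))
        deltas.append((added, removed))
        previous = current
    return deltas


# Three versions of a toy file; the second is an unchanged copy of the first.
history = [
    "open file\nread file",
    "open file\nread file",
    "open file\nread file\nclose socket",
]
deltas = delta_documents(history)
for version, (added, removed) in enumerate(deltas, start=1):
    print(version, dict(added), dict(removed))
```

The unchanged second version yields empty counts, so duplicated text does not reach the topic model a second time, while the third version contributes only the words of its new line.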