One of the difficulties in maintaining a large software system is the absence of documented business domain topics and of a correlation between these domain topics and the source code. Without such a correlation, people with no prior knowledge of the application find it hard to comprehend the functionality of the system. Latent Dirichlet Allocation (LDA), a statistical model, has emerged as a popular technique for discovering topics in large corpora of text documents, but its applicability to extracting business domain topics from source code has not been explored so far. This paper investigates LDA in the context of comprehending large software systems and proposes a human-assisted approach based on LDA for extracting domain topics from source code. The method has been applied to a number of open source and proprietary systems. Preliminary results indicate that LDA is able to identify some of the domain topics and is a satisfactory starting point for further manual refinement of topics.