Context: Some recent static techniques for automatic bug localization have been built around modern information retrieval (IR) models such as latent semantic indexing (LSI). Latent Dirichlet allocation (LDA) is a generative statistical model that has significant advantages, in modularity and extensibility, over both LSI and probabilistic LSI (pLSI). Moreover, LDA has been shown to be effective in topic-model-based information retrieval. In this paper, we present a static LDA-based technique for automatic bug localization and evaluate its effectiveness.

Objective: We evaluate the accuracy and scalability of the LDA-based technique and investigate whether it is suitable for use with open-source software systems of varying size, including those developed using agile methods.

Method: We present five case studies designed to determine the accuracy and scalability of the LDA-based technique, as well as its relationships to software system size and to source code stability. The studies examine over 300 bugs across more than 25 iterations of three software systems.

Results: The studies show that the LDA-based technique maintains sufficient accuracy across all bugs in a single iteration of a software system and scales to a large number of bugs across multiple revisions of two software systems. They also indicate that the accuracy of the LDA-based technique is not affected by the size of the subject software system or by the stability of its source code base.

Conclusion: We conclude that an effective static technique for automatic bug localization can be built around LDA. We also conclude that there is no significant relationship between the accuracy of the LDA-based technique and the size of the subject software system or the stability of its source code base. Thus, the LDA-based technique is widely applicable.
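To make the general idea of LDA-based retrieval concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a collapsed Gibbs sampler for LDA and a query-likelihood ranking of source files against the words of a bug report; the abstract does not specify either choice. The file names, file contents, bug report text, and the hyperparameters `alpha`, `beta`, `num_topics`, and `iterations` are all hypothetical, and a realistic pipeline would also split identifiers, remove stop words, and stem.

```python
import math
import random
import re


def tokenize(text):
    """Lowercase and keep alphabetic runs (no stemming or identifier splitting)."""
    return re.findall(r"[a-z]+", text.lower())


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA with collapsed Gibbs sampling (an assumed inference method)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # counts per (doc, topic)
    topic_word = [[0] * V for _ in range(num_topics)]   # counts per (topic, word)
    topic_total = [0] * num_topics
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed estimates of P(topic | document) and P(word | topic).
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + V * beta) for v in range(V)]
        for t in range(num_topics)
    ]
    return word_id, theta, phi


def rank_files(query_words, word_id, theta, phi):
    """Rank document indices by log P(query | document) under the topic model."""
    scores = []
    for d, doc_theta in enumerate(theta):
        log_likelihood = 0.0
        for w in query_words:
            if w in word_id:  # words unseen in the source code are skipped
                v = word_id[w]
                p = sum(doc_theta[t] * phi[t][v] for t in range(len(phi)))
                log_likelihood += math.log(p)
        scores.append((log_likelihood, d))
    return [d for _, d in sorted(scores, reverse=True)]


# Hypothetical source files, reduced to bags of words.
files = {
    "Parser.java": "parse token stream syntax error parse tree token",
    "Renderer.java": "draw pixel color canvas render draw color",
    "Network.java": "socket connect packet send receive socket timeout",
}
names = list(files)
docs = [tokenize(text) for text in files.values()]
word_id, theta, phi = fit_lda(docs, num_topics=3)

# Hypothetical bug report used as the query.
bug_report = "Crash on syntax error while parsing token stream"
ranking = [names[d] for d in rank_files(tokenize(bug_report), word_id, theta, phi)]
print(ranking)
```

The ranked list is what a developer would inspect from the top down. Because the sampler is stochastic, the ordering on such a tiny corpus depends on the seed and hyperparameters, so the sketch illustrates the workflow and makes no claim about accuracy.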