We compare five generic text models, and certain composite variations of them, for retrieval from large software libraries for the purpose of bug localization. The generic models are the Unigram Model (UM), the Vector Space Model (VSM), the Latent Semantic Analysis Model (LSA), the Latent Dirichlet Allocation Model (LDA), and the Cluster Based Document Model (CBDM). The task is to locate the files that are relevant to a bug that a software developer has reported as a textual description. Our study uses iBUGS, a benchmark bug localization dataset with 75 KLOC and a large number of bugs (291). A major conclusion of our comparative study is that simple text models such as UM and VSM retrieve the relevant files from a library more effectively than more sophisticated models such as LDA. We measured the retrieval effectiveness of the models using two kinds of metrics: (1) Mean Average Precision and (2) rank-based metrics. Using the SCORE metric, we also compare the retrieval effectiveness of the models in our study with that of some other bug localization tools.
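To make the first evaluation metric concrete, the following is a minimal sketch of how Mean Average Precision could be computed over a set of bug reports. It uses the standard textbook definition, which may differ in detail from the study's own evaluation code. The function names, the dictionary layout, and the file names in the usage example are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranked_files: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked file list against the relevant files."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, path in enumerate(ranked_files, start=1):
        if path in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant file appears.
            precision_sum += hits / rank
    # Relevant files that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
) -> float:
    """Mean of the per-bug average precision over all bugs in ground_truth."""
    if not ground_truth:
        return 0.0
    total = sum(
        average_precision(rankings.get(bug_id, []), relevant)
        for bug_id, relevant in ground_truth.items()
    )
    return total / len(ground_truth)


# Hypothetical example: two bug reports, each with a ranked list of files.
example_rankings = {
    "bug-1": ["A.java", "B.java", "C.java"],
    "bug-2": ["X.java", "Y.java"],
}
example_truth = {
    "bug-1": {"A.java", "C.java"},
    "bug-2": {"Y.java"},
}
example_map = mean_average_precision(example_rankings, example_truth)
```

In this made-up example, the first bug has average precision (1/1 + 2/3) / 2 and the second has 1/2, so the mean is 2/3. A model that ranks the relevant files nearer the top of each list scores higher.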