Evaluating the specificity of text retrieval queries to support software engineering tasks

Authors:
Sonia Haiduc;Gabriele Bavota;Rocco Oliveto;Andrian Marcus;Andrea De Lucia
Affiliations:
Wayne State University, USA;University of Salerno, Italy;University of Molise, Italy;Wayne State University, USA;University of Salerno, Italy
Venue:
Proceedings of the 34th International Conference on Software Engineering
Year:
2012

Abstract

Text retrieval approaches have been used to address many software engineering tasks. In most cases, their use involves issuing a textual query to retrieve a set of relevant software artifacts from the system. The performance of all these approaches depends on the quality of the given query (i.e., its ability to describe the information need in such a way that the relevant software artifacts are retrieved during the search). Currently, the only way to tell that a query failed to lead to the expected software artifacts is by investing time and effort in analyzing the search results. In addition, it is often very difficult to ascertain what part of the query leads to poor results. We propose a novel pre-retrieval metric, which reflects the quality of a query by measuring the specificity of its terms. We exemplify the use of the new specificity metric on the task of concept location in source code. A preliminary empirical study shows that our metric is a good effort predictor for text retrieval-based concept location, outperforming existing techniques from the field of natural language document retrieval.
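
The abstract does not define the proposed specificity metric, so the sketch below does not attempt to reproduce it. Instead it illustrates the general idea of a pre-retrieval specificity predictor using average inverse document frequency (avgIDF), a well-known baseline from natural language document retrieval: a query whose terms appear in few documents is treated as more specific. The toy corpus, the tokenization, and the function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def document_frequencies(corpus):
    """Count, for each term, how many documents contain it."""
    df = Counter()
    for doc in corpus:
        df.update(set(doc))  # set() so a term counts once per document
    return df


def avg_idf(query, corpus):
    """Average inverse document frequency of the query terms.

    Higher values mean the terms occur in fewer documents, i.e. the
    query is more specific. Terms absent from the corpus are skipped.
    This is the avgIDF baseline, not the metric proposed in the paper.
    """
    n_docs = len(corpus)
    df = document_frequencies(corpus)
    idfs = [math.log(n_docs / df[t]) for t in query if df[t] > 0]
    return sum(idfs) / len(idfs) if idfs else 0.0


# Hypothetical toy corpus: each "document" is a tokenized source method.
corpus = [
    ["open", "file", "read", "buffer"],
    ["close", "file", "flush", "buffer"],
    ["parse", "xml", "file", "node"],
    ["render", "widget", "paint", "color"],
]

specific = avg_idf(["xml", "parse"], corpus)    # rare terms
generic = avg_idf(["file", "buffer"], corpus)   # common terms
print(f"specific query: {specific:.3f}, generic query: {generic:.3f}")
```

Because the score is computed from corpus statistics alone, it is available before any search is run, which is what makes a predictor of this kind "pre-retrieval".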