A comprehensive characterization of NLP techniques for identifying equivalent requirements

Authors:
Davide Falessi;Giovanni Cantone;Gerardo Canfora
Affiliations:
Simula Research Laboratory (Norway) and University of Rome Tor Vergata, Rome, Italy;University of Rome Tor Vergata, Rome, Italy;RCOST -- Research Centre on Software Technology and University of Sannio, Benevento, Italy
Venue:
Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement
Year:
2010

Abstract

Although linking artifacts of the same type (clone detection) or of different types (traceability recovery) is very important in software engineering, it is extremely tedious and error-prone, and it requires significant effort. Past research has focused on supporting analysts with mechanisms based on Natural Language Processing (NLP) that identify candidate links. Because a plethora of NLP techniques exists and their performance varies across contexts, it is important to characterize them according to the level of support they provide. The aim of this paper is to characterize a comprehensive set of NLP techniques according to the level of support they provide to human analysts in detecting equivalent requirements. The characterization consists of a case study, featuring real requirements, in the context of an Italian company in the defense and aerospace domain. The major result of the case study is that simple NLP techniques are more precise than complex ones.
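
As an illustration of what a "simple" NLP technique for proposing candidate links between requirements might look like, the following minimal sketch scores every pair of requirements by the cosine similarity of their raw term-frequency vectors and reports the pairs above a cutoff. It is not taken from the paper: the abstract does not name the techniques that were evaluated, and the function names, the example requirements, and the 0.5 threshold are all assumptions made for this example.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    # An empty text has a zero-length vector; treat it as dissimilar to everything.
    return dot / norm if norm else 0.0


def candidate_pairs(requirements, threshold=0.5):
    """Return (id_a, id_b, score) for pairs at or above the threshold, best first.

    The threshold is a hypothetical cutoff; an analyst would tune it per context.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(requirements.items(), 2):
        score = cosine_similarity(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Invented example requirements, not drawn from the case study.
requirements = {
    "R1": "The system shall log every failed login attempt.",
    "R2": "Every failed login attempt shall be logged by the system.",
    "R3": "The operator shall be able to export reports as PDF.",
}

for id_a, id_b, score in candidate_pairs(requirements):
    print(id_a, id_b, round(score, 2))
```

The sketch only ranks candidates; deciding whether a flagged pair is truly equivalent remains the analyst's job. More elaborate variants (stemming, stop-word removal, other weighting schemes) would replace `tokenize` or the vector construction without changing the overall structure.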