Using clustering to improve the structure of natural language requirements documents

  • Authors: Alessio Ferrari; Stefania Gnesi; Gabriele Tolomei
  • Affiliations: ISTI-CNR, Pisa, Italy; ISTI-CNR, Pisa, Italy; DAIS, Università Ca' Foscari Venezia, Italy
  • Venue: REFSQ'13 Proceedings of the 19th international conference on Requirements Engineering: Foundation for Software Quality
  • Year: 2013

Abstract

[Context and motivation] System requirements are normally provided in the form of natural language documents. Such documents need to be properly structured so that readers can take in the requirements easily. A structure that supports a proper understanding of a requirements document shall satisfy two main quality attributes: (i) requirements relatedness: each requirement is conceptually connected with the other requirements in the same section; (ii) sections independence: each section is conceptually separated from the others. [Question/Problem] Automatically identifying the parts of the document that lack requirements relatedness and sections independence may help improve the document structure. [Principal idea/results] To this end, we define a novel clustering algorithm named Sliding Head-Tail Component (S-HTC). The algorithm groups together similar requirements that are contiguous in the requirements document. We claim that this algorithm discovers the structure of the document as it is perceived by the reader. If the structure originally provided by the document does not match the structure discovered by the algorithm, the mismatch gives hints for identifying the parts of the document that lack requirements relatedness and sections independence. [Contribution] We evaluate the effectiveness of the algorithm with a pilot test on a requirements standard from the railway domain (583 requirements).
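
The abstract does not give the details of S-HTC, so the following is not the authors' algorithm. It is a minimal sketch of the general idea only: group contiguous requirements that are similar, then compare the discovered boundaries with the original section boundaries. The similarity measure (Jaccard overlap of word sets), the fixed threshold, all function names, and the sample requirements are assumptions made for illustration.

```python
import re

def tokens(text):
    """Lowercase word set of a requirement (no stemming, no stop-word removal)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def segment_contiguous(requirements, threshold=0.3):
    """Hypothetical stand-in for a contiguity-aware clustering step.

    Walks the requirements in document order and starts a new group
    whenever the similarity to the previous requirement drops below
    `threshold`. Returns a list of groups of requirement indices.
    """
    if not requirements:
        return []
    groups = [[0]]
    prev = tokens(requirements[0])
    for i in range(1, len(requirements)):
        cur = tokens(requirements[i])
        if jaccard(prev, cur) >= threshold:
            groups[-1].append(i)   # similar enough: same group
        else:
            groups.append([i])     # dissimilar: open a new group
        prev = cur
    return groups

def boundaries(groups):
    """Indices at which a new group starts (index 0 excluded)."""
    return {g[0] for g in groups[1:]}

# Invented sample requirements, in document order.
reqs = [
    "The train shall stop when the brake command is received.",
    "The brake command shall be sent when the train exceeds the speed limit.",
    "The driver display shall show the current speed limit.",
    "The maintenance log shall record every software update.",
    "Every software update shall be signed by the maintenance operator.",
]

# Sections as given by the (invented) original document.
original_sections = [[0, 1], [2, 3], [4]]

discovered = segment_contiguous(reqs, threshold=0.3)

# Boundaries present in only one of the two structures are the hints.
mismatches = boundaries(original_sections) ^ boundaries(discovered)
print(discovered)
print(sorted(mismatches))
```

In this toy input, a boundary that appears in only one of the two structures marks a place where the original sectioning may lack requirements relatedness or sections independence. A realistic implementation would likely need a stronger similarity measure than raw word overlap, since shared function words such as "the" and "shall" inflate the scores.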