Mining Source Code for Structural Regularities

  • Authors:
  • Angela Lozano; Andy Kellens; Kim Mens; Gabriela Arevalo

  • Venue:
  • WCRE '10 Proceedings of the 2010 17th Working Conference on Reverse Engineering
  • Year:
  • 2010

Abstract

During software development, design rules and contracts in the source code are often encoded through regularities, such as API usage protocols, coding idioms and naming conventions. The structural regularities that govern a program can aid in the comprehension and maintenance of the application, but are often implicit or undocumented. Tool support for extracting these regularities from the source code can provide developers with useful insights. Building such tool support is not trivial, however, in particular because the informal nature of regularities results in frequent deviations from and exceptions to them. We propose an automated approach, based on association rule mining, to discover the structural regularities that govern the source code of a software system. We chose this technique because of its resilience to exceptions. In general, tool support for mining regularities tends to discover a huge number of rules, making interpretation of the results hard and time-consuming. To ease the interpretation, we reduce the results to a minimal canonical form and group them to obtain a more rational description of the discovered regularities. As an initial feasibility study of our approach, we applied it to two open-source systems, namely Intensive (Smalltalk) and FreeCol (Java).
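
As a rough illustration of why association rule mining can tolerate exceptions, the sketch below mines pairwise rules over a toy set of classes. It is not the authors' implementation: the class names, the structural properties, the `mine_rules` helper and both thresholds are invented for this example, and the reduction to a minimal canonical form and the grouping step described in the abstract are left out. The point is only that a rule whose confidence is below 1.0 can still be reported, so one deviating class does not hide the regularity.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each class is described by a set of structural
# properties. All names and properties are invented for illustration.
entities = {
    "OpenAction": {"name ends with Action", "extends Action", "implements perform()"},
    "SaveAction": {"name ends with Action", "extends Action", "implements perform()"},
    "QuitAction": {"name ends with Action", "extends Action", "implements perform()"},
    # Deviation from the regularity: this class does not implement perform().
    "UndoAction": {"name ends with Action", "extends Action"},
    "MapPanel": {"name ends with Panel", "extends Panel"},
    "ChatPanel": {"name ends with Panel", "extends Panel"},
}


def mine_rules(entities, min_support=2, min_confidence=0.7):
    """Return rules (lhs, rhs, support, confidence) between single properties.

    support    = number of entities that have both properties
    confidence = support / number of entities that have the lhs property
    """
    single = Counter()  # how many entities have each property
    pair = Counter()    # how many entities have each pair of properties
    for props in entities.values():
        single.update(props)
        pair.update(combinations(sorted(props), 2))

    rules = []
    for (a, b), both in pair.items():
        if both < min_support:
            continue
        # Each co-occurring pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = both / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, both, confidence))
    # Most confident rules first; ties broken by support, then alphabetically.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


if __name__ == "__main__":
    for lhs, rhs, support, confidence in mine_rules(entities):
        print(f"{lhs} => {rhs}  (support={support}, confidence={confidence:.2f})")
```

With these assumed thresholds, the rule "name ends with Action => implements perform()" is kept at confidence 0.75 even though `UndoAction` violates it. Raising `min_confidence` to 1.0 would drop it, which is the sensitivity to exceptions that a strict, exception-free formulation would have.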