Improving software modularization via automated analysis of latent topics and dependencies

  • Authors: Gabriele Bavota; Malcom Gethers; Rocco Oliveto; Denys Poshyvanyk; Andrea de Lucia

  • Affiliations: University of Salerno, Fisciano (SA), Italy; University of Maryland, Baltimore County, Baltimore, MD; University of Molise, Pesche (IS), Italy; The College of William and Mary, Williamsburg, VA; University of Salerno, Fisciano (SA), Italy

  • Venue: ACM Transactions on Software Engineering and Methodology (TOSEM)
  • Year: 2014

Abstract

During software maintenance, the original modularization of a program often decays, reducing its quality. One of the main reasons for such architectural erosion is the suboptimal placement of source-code classes in software packages. To alleviate this issue, we propose an automated approach that helps developers improve the quality of software modularization. Our approach analyzes the latent topics in source code together with structural dependencies to recommend (and explain) refactoring operations that move a class to a more suitable package. The topics are extracted with Relational Topic Models (RTM), a probabilistic topic modeling technique. The resulting tool, named R3 (Rational Refactoring via RTM), has been evaluated in two empirical studies. The results of the first study, conducted on nine software systems, indicate that R3 reduces coupling among the software modules by 10% to 30%. The second study, with 62 developers, confirms that R3 provides meaningful recommendations (and explanations) for Move Class refactoring. Specifically, more than 70% of the recommendations were considered meaningful from a functional point of view.
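
The general idea of combining topic similarity with structural dependencies to suggest a target package can be illustrated with a minimal sketch. This is not the authors' R3 implementation and it does not use RTM: it assumes each class already has a topic-distribution vector (from any topic model), scores each candidate package by a weighted average of cosine similarity and dependency density, and picks the highest score. The function names, the weight `alpha`, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_move(cls, package_of, topics, deps, alpha=0.5):
    """Score every package as a home for `cls` (illustrative heuristic only).

    package_of: class name -> current package
    topics:     class name -> topic vector (assumed to be precomputed)
    deps:       set of (source, target) dependency pairs between classes
    alpha:      hypothetical weight of the topic signal vs. the structural one
    """
    # Group all other classes by the package they currently live in.
    members = defaultdict(list)
    for other, pkg in package_of.items():
        if other != cls:
            members[pkg].append(other)

    scores = {}
    for pkg, others in members.items():
        # Average topic similarity between cls and the package's classes.
        semantic = sum(cosine(topics[cls], topics[o]) for o in others) / len(others)
        # Fraction of the package's classes linked to cls in either direction.
        structural = sum(
            1 for o in others if (cls, o) in deps or (o, cls) in deps
        ) / len(others)
        scores[pkg] = alpha * semantic + (1 - alpha) * structural

    if not scores:
        return None, scores
    return max(scores, key=scores.get), scores


# Toy data: "Invoice" sits in "ui" but talks about, and depends on, billing code.
package_of = {
    "Invoice": "ui", "MainWindow": "ui", "Dialog": "ui",
    "InvoiceRepo": "billing", "TaxCalc": "billing",
}
topics = {
    "Invoice": [0.9, 0.1], "InvoiceRepo": [0.8, 0.2], "TaxCalc": [0.85, 0.15],
    "MainWindow": [0.1, 0.9], "Dialog": [0.2, 0.8],
}
deps = {("Invoice", "InvoiceRepo"), ("TaxCalc", "Invoice")}

best, scores = recommend_move("Invoice", package_of, topics, deps)
print(best, scores)
```

In this toy setting the class's topic vector and its dependencies both point to the same package, so the two signals agree; when they disagree, the choice of `alpha` decides which one dominates.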