Software Clustering based on Information Loss Minimization

  • Authors:
  • Periklis Andritsos; Vassilios Tzerpos

  • Venue:
  • WCRE '03 Proceedings of the 10th Working Conference on Reverse Engineering
  • Year:
  • 2003

Abstract

The majority of the algorithms in the software clustering literature utilize structural information in order to decompose large software systems. Other approaches, such as using file names or ownership information, have also demonstrated merit. However, there is no intuitive way to combine information obtained from these two different types of techniques. In this paper, we present an approach that combines structural and non-structural information in an integrated fashion. LIMBO is a scalable hierarchical clustering algorithm based on the minimization of information loss when clustering a software system. We apply LIMBO to two large software systems in a number of experiments. The results indicate that this approach produces valid and useful clusterings of large software systems. LIMBO can also be used to evaluate the usefulness of various types of non-structural information to the software clustering process.
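
To make "minimization of information loss" concrete, the following is a minimal sketch of how the cost of merging two clusters could be scored. It assumes the standard agglomerative Information Bottleneck formulation, in which the loss equals the combined cluster weight times the weighted Jensen-Shannon divergence of the clusters' feature distributions. The abstract does not spell out this formula, so treat it as an assumption rather than the authors' exact procedure. The function names and the toy feature vectors are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.

    Terms where p[i] == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_information_loss(w1, p1, w2, p2):
    """Score the merge of two clusters (assumed formulation, see lead-in).

    w1, w2: cluster weights, e.g. the fraction of all files in each cluster
            (both must be positive).
    p1, p2: each cluster's probability distribution over features. A feature
            may be structural (a function the file uses) or non-structural
            (a developer name), so both kinds share one vector.

    Returns (loss, merged_distribution).
    """
    total = w1 + w2
    pi1, pi2 = w1 / total, w2 / total
    # The merged cluster's distribution is the weighted average of the two.
    merged = [pi1 * a + pi2 * b for a, b in zip(p1, p2)]
    # Weighted Jensen-Shannon divergence between the two distributions.
    js = pi1 * kl_divergence(p1, merged) + pi2 * kl_divergence(p2, merged)
    return total * js, merged


# Hypothetical toy data: two files, each holding a quarter of the system.
same_loss, _ = merge_information_loss(0.25, [0.5, 0.5], 0.25, [0.5, 0.5])
diff_loss, _ = merge_information_loss(0.25, [1.0, 0.0], 0.25, [0.0, 1.0])
print(same_loss, diff_loss)
```

Under these assumptions, merging two files with identical feature profiles costs nothing, while merging files with disjoint profiles is expensive. A hierarchical algorithm built on this score would repeatedly merge the pair with the smallest loss.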