Identifying and characterizing change-prone classes in two large-scale open-source products

  • Authors:
  • A. Güneş Koru;Hongfang Liu

  • Affiliations:
  • Department of Information Systems, University of Maryland, Baltimore County, UMBC-EASEL, Empirical and Applied Software Engineering Laboratory, 1000 Hilltop Circle, Baltimore, MD 21250, USA;Georgetown University Medical Center, Department of Biostatistics, Bioinformatics, and Biomathematics, 4000 Reservoir Road, NW Suite 120, Washington, DC 20007, USA

  • Venue:
  • Journal of Systems and Software
  • Year:
  • 2007

Abstract

Developing and maintaining open-source software has become an important source of profit for many companies. Change-prone classes in open-source products increase project costs by requiring developers to spend additional effort and time. Identifying and characterizing change-prone classes can enable developers to focus timely preventive actions, such as peer reviews and inspections, on classes with similar characteristics in future releases or products. In this study, we collected a set of static metrics and change data at the class level from two open-source projects, KOffice and Mozilla. Using these data, we first tested and validated Pareto's Law, which implies that a great majority (around 80%) of change is rooted in a small proportion (around 20%) of classes. Then, we identified and characterized the change-prone classes in the two products by producing tree-based models. In addition, using the tree-based models, we suggested a prioritization strategy for using project resources efficiently on focused preventive actions. Our empirical results showed that this strategy was effective for prioritization purposes. This study should provide useful guidance to practitioners involved in the development and maintenance of large-scale open-source products.
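
The Pareto check described in the abstract can be illustrated with a small calculation. The abstract does not describe the authors' actual procedure or data, so the following is only a minimal sketch under stated assumptions: the function name, the `top_fraction` parameter, and the per-class change counts are all hypothetical.

```python
def change_share_of_top_classes(changes_per_class, top_fraction=0.2):
    """Return the fraction of all change that falls in the most-changed classes.

    changes_per_class: dict mapping a class name to its change count
                       (hypothetical input format, not taken from the paper).
    top_fraction:      proportion of classes treated as "change-prone".
    """
    if not changes_per_class:
        raise ValueError("need at least one class")
    # Rank classes from most changed to least changed.
    ranked = sorted(changes_per_class.values(), reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    # Always inspect at least one class, even for very small inputs.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / total


# Made-up change counts for ten classes, for illustration only.
example = {"A": 120, "B": 40, "C": 10, "D": 8, "E": 6,
           "F": 5, "G": 4, "H": 3, "I": 2, "J": 2}
share = change_share_of_top_classes(example)
print(f"Top 20% of classes account for {share:.0%} of change")
```

A share near 0.8 for `top_fraction=0.2` would be consistent with the 80/20 pattern the abstract refers to; a much lower share would suggest that change is spread more evenly across classes.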