An empirical study on the impact of duplicate code

  • Authors:
  • Keisuke Hotta;Yui Sasaki;Yukiko Sano;Yoshiki Higo;Shinji Kusumoto

  • Affiliations:
  • Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan

  • Venue:
  • Advances in Software Engineering - Special issue on Software Quality Assurance Methodologies and Techniques
  • Year:
  • 2012

Abstract

It is often said that duplicate code is one of the factors that make software maintenance more difficult, and much research on detecting, removing, or managing duplicate code rests on this premise. In recent years, however, some researchers have questioned this premise and conducted empirical studies of the influence of duplicate code. In this study, we investigate the matter empirically from a standpoint different from that of previous studies. We define a new indicator, "modification frequency," to measure the impact of duplicate code, and we compare its values between duplicate code and nonduplicate code. The features of this study are as follows: the indicator is based on modification places instead of the ratio of modified lines; we use multiple duplicate code detection tools to reduce the biases of individual tools; and we compare the results of the proposed method with those of two other investigation methods. The results show that duplicate code tends to be modified less frequently than nonduplicate code, and we found some instances in which the proposed method evaluates the influence of duplicate code more accurately than the existing investigation methods.
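
The abstract does not give the formula for the indicator, so the following is only an illustrative sketch of the distinction it draws between counting modification places and counting modified lines. It assumes (this is not stated in the source) that a "modification place" is a maximal run of consecutive modified lines, and that the frequency is the average number of such places per revision inside a region of code. The function names, the input layout, and the lack of any normalization by region size are all hypothetical choices made for this example.

```python
from typing import Iterable, List, Set


def count_modification_places(modified_lines: Iterable[int]) -> int:
    """Count maximal runs of consecutive modified line numbers.

    Assumption: one run of adjacent modified lines is one "place".
    """
    places = 0
    previous = None
    for line in sorted(set(modified_lines)):
        # A new place starts whenever the line does not directly follow the previous one.
        if previous is None or line != previous + 1:
            places += 1
        previous = line
    return places


def modification_frequency(revisions: List[List[int]], region: Set[int]) -> float:
    """Average number of modification places per revision that fall inside `region`.

    `revisions` holds, for each revision, the line numbers it modified.
    `region` is the set of line numbers labeled duplicate (or nonduplicate).
    This is a hypothetical formulation, not the paper's definition.
    """
    if not revisions:
        return 0.0
    total = sum(
        count_modification_places(line for line in modified if line in region)
        for modified in revisions
    )
    return total / len(revisions)


# Hypothetical data: lines 1-10 are duplicate code, lines 11-40 are not.
duplicate_region = set(range(1, 11))
nonduplicate_region = set(range(11, 41))
revisions = [[3, 4, 5, 20], [4], [30, 31]]

duplicate_frequency = modification_frequency(revisions, duplicate_region)
nonduplicate_frequency = modification_frequency(revisions, nonduplicate_region)
```

Under this reading, a revision that rewrites three adjacent lines counts as one place, whereas a line-ratio measure would count it three times. That difference is what lets a place-based indicator avoid overweighting a single large edit.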