An empirical study on the impact of duplicate code

Authors:
Keisuke Hotta;Yui Sasaki;Yukiko Sano;Yoshiki Higo;Shinji Kusumoto
Affiliations:
Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan;Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
Venue:
Advances in Software Engineering - Special issue on Software Quality Assurance Methodologies and Techniques
Year:
2012



Abstract

Duplicate code is widely said to be one of the factors that make software maintenance more difficult, and many research efforts on detecting, removing, or managing duplicate code rest on this premise. In recent years, however, some researchers have questioned this premise and conducted empirical studies to investigate the influence of duplicate code. In this study, we investigate the matter empirically from a standpoint different from that of previous studies. We define a new indicator, "modification frequency", to measure the impact of duplicate code, and we compare its values between duplicate code and nonduplicate code. The features of this study are as follows: the indicator is based on modification places instead of the ratio of modified lines; we use multiple duplicate code detection tools to reduce the biases of individual detection tools; and we compare the result of the proposed method with two other investigation methods. The result shows that duplicate code tends to be modified less frequently than nonduplicate code, and we found some instances in which the proposed method evaluates the influence of duplicate code more accurately than the existing investigation methods.
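
To make the distinction between counting modification places and counting modified lines concrete, the following is a minimal Python sketch. It is not the authors' definition: the abstract does not give a formula, so the normalization by lines of code, the `Span` type, and the function names are all assumptions made for illustration. Each contiguous changed region (hunk) counts as one modification place regardless of how many lines it touches, and it is attributed to duplicate code if it overlaps any detected clone span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A contiguous range of source lines (hypothetical helper type)."""
    start: int  # first line, inclusive
    end: int    # last line, inclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if the two line ranges share at least one line."""
    return a.start <= b.end and b.start <= a.end


def count_lines(spans: list[Span]) -> int:
    """Total lines covered, assuming the spans do not overlap each other."""
    return sum(s.end - s.start + 1 for s in spans)


def modification_frequency(revisions):
    """Return (mf_duplicate, mf_nonduplicate).

    `revisions` is a list of (hunks, clone_spans, total_lines) tuples, one per
    revision. Assumption: frequency = modification places / lines of code,
    accumulated over all revisions. This normalization is illustrative only.
    """
    dup_places = non_places = 0
    dup_loc = non_loc = 0
    for hunks, clones, total_lines in revisions:
        for hunk in hunks:
            # One hunk is one modification place, however many lines it spans.
            if any(overlaps(hunk, clone) for clone in clones):
                dup_places += 1
            else:
                non_places += 1
        clone_lines = count_lines(clones)
        dup_loc += clone_lines
        non_loc += total_lines - clone_lines
    mf_dup = dup_places / dup_loc if dup_loc else 0.0
    mf_non = non_places / non_loc if non_loc else 0.0
    return mf_dup, mf_non


# One revision of a 100-line file: lines 1-10 are cloned; one 3-line hunk
# falls inside the clone and one 1-line hunk falls outside it.
example = [([Span(1, 3), Span(50, 50)], [Span(1, 10)], 100)]
mf_dup, mf_non = modification_frequency(example)
```

In this sketch a three-line edit and a one-line edit each contribute a single modification place, whereas a ratio of modified lines would weight the first edit three times as heavily.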