Large-Scale Code Reuse in Open Source Software

  • Authors: Audris Mockus
  • Affiliations: Avaya Labs Research, USA
  • Venue: FLOSS '07: Proceedings of the First International Workshop on Emerging Trends in FLOSS Research and Development
  • Year: 2007

Abstract

We explore the practice of large-scale reuse, meaning reuse that involves at least a group of source code files. Our research questions are to determine the extent of such reuse in open source projects, to identify the code that is reused the most, and to investigate patterns of large-scale reuse. We start by assembling a sample of projects that includes all code in several large repositories of open source projects, all projects bundled with popular distributions of Linux and BSD, and several large individual projects. We then obtain the source code, identify groups of files reused among projects, and determine the code that is most widely reused in our sample. Our findings indicate that more than 50% of the files were used in more than one project. The most widely reused components were small; they comprised templates requiring major and minor modifications and a group of files reused without any change. Some widely reused components involved hundreds of files.
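
The abstract does not say how reused files are detected, so the following is only a minimal sketch under an assumption of our own: two files count as "the same" when their bytes are identical, which is checked with a content hash. All names here (`content_key`, `projects_by_file`, `reuse_fraction`, `most_reused`, and the toy `sample` data) are hypothetical and are not taken from the paper. Exact matching would miss the modified templates the abstract mentions, so it illustrates only the "reused without any change" case.

```python
import hashlib
from collections import defaultdict


def content_key(data: bytes) -> str:
    """Fingerprint a file by its exact bytes (assumption: identical bytes = reuse)."""
    return hashlib.sha1(data).hexdigest()


def projects_by_file(projects: dict[str, dict[str, bytes]]) -> dict[str, set[str]]:
    """Map each distinct file fingerprint to the set of projects that contain it."""
    owners: dict[str, set[str]] = defaultdict(set)
    for project, files in projects.items():
        for data in files.values():
            owners[content_key(data)].add(project)
    return owners


def reuse_fraction(projects: dict[str, dict[str, bytes]]) -> float:
    """Share of distinct files that occur in more than one project."""
    owners = projects_by_file(projects)
    if not owners:
        return 0.0
    shared = sum(1 for used_in in owners.values() if len(used_in) > 1)
    return shared / len(owners)


def most_reused(projects: dict[str, dict[str, bytes]], top: int = 3) -> list[tuple[str, int]]:
    """Return (fingerprint, project count) pairs, most widely reused first."""
    owners = projects_by_file(projects)
    ranked = sorted(owners.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(key, len(used_in)) for key, used_in in ranked[:top]]


# Toy input: project name -> {relative path: file bytes}. Purely illustrative.
sample = {
    "proj_a": {"lib/getopt.c": b"int getopt;\n", "main.c": b"int main_a;\n"},
    "proj_b": {"compat/getopt.c": b"int getopt;\n", "main.c": b"int main_b;\n"},
    "proj_c": {"src/getopt.c": b"int getopt;\n", "util.c": b"int main_b;\n"},
}

if __name__ == "__main__":
    print(f"fraction of files reused: {reuse_fraction(sample):.2f}")
    for key, count in most_reused(sample):
        print(key[:8], "appears in", count, "projects")
```

Keying on content rather than on file path lets a copy be found even when a project renames or relocates it. Detecting groups of files reused together, as the abstract describes, would need an additional step such as comparing the fingerprint sets of whole directories.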