On the Use of Domain Terms in Source Code

  • Authors:
  • Sonia Haiduc; Andrian Marcus

  • Venue:
  • ICPC '08: Proceedings of the 2008 16th IEEE International Conference on Program Comprehension
  • Year:
  • 2008

Abstract

Developers often embed information about the problem domain of the software, and about the solution it implements, in comments and identifiers. When using software developed by others, or when they are new to a project, programmers know little about how domain information is reflected in the source code. Programmers often learn about the domain from external sources such as books and articles. Hence, it is important that comments and identifiers use terms commonly known in the domain literature, as programmers are likely to use such terms when searching the source code. The paper presents a case study that investigated how domain terms are used in comments and identifiers. The study focused on three research questions: (1) to what degree are domain terms found in the source code of software from a particular problem domain? (2) which is the predominant source of domain terms: identifiers or comments? and (3) to what degree are domain terms shared between several systems from the same problem domain? Within the studied software, we found that, on average, 42% of the domain terms were used in the source code; 23% of the domain terms used in the source code were present in comments only, whereas only 11% were present in identifiers only; and there was 63% agreement in the use of domain terms between any two software systems.
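
To make the reported quantities concrete, the following is a minimal sketch of how such percentages could be computed from sets of terms. It is not the authors' tooling: the function names (`split_identifier`, `term_usage`, `agreement`), the identifier-splitting rule, and the sample terms are all hypothetical, and the abstract does not define how "agreement" was measured, so the Jaccard overlap used here is only one plausible stand-in.

```python
import re


def split_identifier(name):
    """Split an identifier on underscores and camelCase boundaries (assumed rule)."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {p.lower() for p in parts}


def term_usage(domain_terms, identifier_terms, comment_terms):
    """Report how a set of domain terms is reflected in identifiers and comments.

    'coverage' is relative to all domain terms; the other three ratios are
    relative to the domain terms actually used in the code, mirroring the
    phrasing of the abstract.
    """
    domain = set(domain_terms)
    in_ids = domain & set(identifier_terms)
    in_comments = domain & set(comment_terms)
    used = in_ids | in_comments
    n_used = len(used)

    def share(subset):
        return len(subset) / n_used if n_used else 0.0

    return {
        "coverage": len(used) / len(domain) if domain else 0.0,
        "comments_only": share(in_comments - in_ids),
        "identifiers_only": share(in_ids - in_comments),
        "both": share(in_ids & in_comments),
    }


def agreement(used_a, used_b):
    """Jaccard overlap of the domain terms used by two systems.

    Assumption: the abstract does not say how agreement was defined.
    """
    a, b = set(used_a), set(used_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical data for a graph-theory library.
domain = {"graph", "vertex", "edge", "path", "cycle"}
identifiers = set()
for name in ["shortestPath", "edge_list", "nodeCount"]:
    identifiers |= split_identifier(name)
comments = {"vertex", "edge", "visited"}

stats = term_usage(domain, identifiers, comments)
```

In this toy data, three of the five domain terms appear in the code, split evenly among comments only (`vertex`), identifiers only (`path`), and both (`edge`). A real replication would also need a principled way to extract the domain vocabulary and to normalize terms (for example by stemming), neither of which the abstract specifies.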