The impact of identifier style on effort and comprehension

  • Authors:
  • Dave Binkley;Marcia Davis;Dawn Lawrie;Jonathan I. Maletic;Christopher Morrell;Bonita Sharif

  • Affiliations:
  • Department of Computer Science, Loyola University Maryland, Baltimore, USA 21210-2699;Center for Social Organization of Schools, Johns Hopkins University, Baltimore, USA 21218;Department of Computer Science, Loyola University Maryland, Baltimore, USA 21210-2699;Department of Computer Science, Kent State University, Kent, USA 44242;Department of Mathematics and Statistics, Loyola University Maryland, Baltimore, USA 21210-2699;Department of Computer Science and Information Systems, Youngstown State University, Youngstown, USA 44555

  • Venue:
  • Empirical Software Engineering
  • Year:
  • 2013

Abstract

A family of studies investigating the impact of program identifier style on human comprehension is presented. Two popular identifier styles are examined, namely camel case and underscore. The underlying hypothesis is that identifier style affects the speed and accuracy of comprehending source code. To investigate this hypothesis, five studies were designed and conducted. The first study, which investigates how well humans read identifiers in the two different styles, focuses on low-level readability issues. The remaining four studies build on the first to focus on the semantic implications of identifier style. The studies involve 150 participants with varied demographics from two different universities. A range of experimental methods is used in the studies, including timed testing, read aloud, and eye tracking. These methods produce a broad set of measurements, and appropriate statistical methods, such as regression models and Generalized Linear Mixed Models (GLMMs), are applied to analyze the results. While unexpected, the results demonstrate that the tasks of reading and comprehending source code are fundamentally different from those of reading and comprehending natural language. Furthermore, as the task becomes similar to reading prose, the results become similar to work on reading natural language text. For more "source focused" tasks, experienced software developers appear to be less affected by identifier style; however, beginners benefit from the use of camel casing with respect to accuracy and effort.
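
For readers unfamiliar with the two styles named in the abstract, the following minimal sketch shows the same multi-word identifier written in camel case and in underscore style, and recovers the constituent words from each. The identifier names and the `split_identifier` helper are hypothetical illustrations and are not taken from the study materials.

```python
import re


def split_identifier(name):
    """Split a camel case or underscore identifier into lowercase words.

    Hypothetical helper for illustration only; not part of the study.
    """
    words = []
    # Underscore style: words are separated by "_".
    for part in name.split("_"):
        # Camel case: a new word starts at each capital letter;
        # runs of capitals (acronyms) are kept together.
        words.extend(re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", part))
    return [w.lower() for w in words]


camel_case = "movePointToOrigin"       # camel case style
underscore = "move_point_to_origin"    # underscore style

# Both spellings carry the same sequence of words.
print(split_identifier(camel_case))
print(split_identifier(underscore))
```

Both identifiers encode the same four words; only the visual cue marking the word boundaries differs, which is the variable the studies manipulate.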