A simpler model of software readability

  • Authors: Daryl Posnett; Abram Hindle; Premkumar Devanbu
  • Affiliations: University of California, Davis, Davis, USA (all three authors)
  • Venue: Proceedings of the 8th Working Conference on Mining Software Repositories
  • Year: 2011

Abstract

Software readability is a property that influences how easily a given piece of code can be read and understood. Because readability can affect maintainability and quality, among other properties, programmers are very concerned about the readability of code. If automatic readability checkers could be built, they could be integrated into development toolchains and thus continually inform developers about the readability level of the code. Unfortunately, readability is a subjective code property and is not amenable to direct automated measurement. In a recently published study, Buse et al. asked 100 participants to rate code snippets by readability, yielding arguably reliable mean readability scores for each snippet; they then built a fairly complex predictive model for these mean scores using a large, diverse set of directly measurable source code properties. We build on this work: we present a simple, intuitive theory of readability, based on size and code entropy, and show how this theory leads to a much sparser, yet statistically significant, model of the mean readability scores produced in Buse's studies. Our model uses well-known size metrics and Halstead metrics, which are easily extracted using a variety of tools. We argue that this approach provides a more theoretically well-founded and practically usable approach to readability measurement.
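
To make the quantities named in the abstract concrete, the following is a minimal sketch of two of them: Halstead volume (the standard definition, V = N · log2(n), with N the total number of operators and operands and n the number of distinct ones) and Shannon entropy over a snippet's token frequencies. The function names and the assumption that the snippet has already been split into operator and operand lists are illustrative only; this is not the authors' fitted model and it contains no coefficients from the paper.

```python
import math
from collections import Counter


def halstead_volume(operators, operands):
    """Halstead volume V = N * log2(n).

    N is the total count of operators and operands; n is the number of
    distinct operators plus the number of distinct operands.
    Assumes the caller has already tokenized the snippet (hypothetical input).
    """
    total = len(operators) + len(operands)
    distinct = len(set(operators)) + len(set(operands))
    if distinct == 0:
        return 0.0
    return total * math.log2(distinct)


def token_entropy(tokens):
    """Shannon entropy (in bits) of the token frequency distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hand-tokenized example for the statement: x = a + b
    ops = ["=", "+"]
    opnds = ["x", "a", "b"]
    print(halstead_volume(ops, opnds))
    print(token_entropy(ops + opnds))
```

A snippet that repeats a few tokens many times has low entropy relative to its length, while a snippet in which every token is different has the maximum entropy for its size; both quantities grow with the amount of code, which is why size is treated as a separate factor.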