Analyzing and inferring the structure of code change

Authors:
David Notkin;Miryung Kim
Affiliations:
University of Washington;University of Washington
Venue:
Analyzing and inferring the structure of code change
Year:
2008

Citing 0
Cited 3

Discovering and representing systematic code changes
ICSE '09 Proceedings of the 31st International Conference on Software Engineering

A language for software variation research
GPCE '10 Proceedings of the Ninth International Conference on Generative Programming and Component Engineering

Automated documentation inference to explain failed tests
ASE '11 Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering

Quantified Score

Hi-index	0.00

Abstract

Programmers often need to reason about how a program evolved between two or more program versions. Reasoning about program changes is challenging because there is a significant gap between how programmers think about changes and how existing program differencing tools represent them. For example, even though a modification of a locking protocol is conceptually simple and systematic at the code level, diff reports it as scattered text additions and deletions per file. To enable programmers to reason about program differences at a high level, this dissertation proposes an approach that automatically discovers and represents systematic changes as first-order logic rules. This rule inference approach is based on the insight that high-level changes are often systematic at the code level and that first-order logic rules can represent such systematic changes concisely.

There are two similar but separate rule-inference techniques, each with its own kind of rules. The first kind captures systematic changes to application programming interface (API) names and signatures. The second kind captures systematic differences at the level of code elements (e.g., types, methods, and fields) and structural dependencies (e.g., method calls and subtyping relationships). Both kinds of rules concisely represent systematic changes and explicitly note exceptions to them. Thus, software engineers can quickly get an overview of program differences and identify potential bugs caused by inconsistent updates. The viability of this approach is demonstrated through its application to several open source projects as well as a focus group study with professional software engineers from a large e-commerce company.

This dissertation also presents the empirical studies that motivated the rule-based change inference approach. It has long been believed that code clones (syntactically similar code fragments) indicate poor software design and that refactoring code clones improves software quality. By focusing on the evolutionary aspects of clones, this dissertation discovered that, in contrast to conventional wisdom, programmers often create and maintain code duplicates with clear intent, and that immediate and aggressive refactoring may not be the best solution for managing code clones. The studies also contributed to the insight that a high-level change operation comprises systematic transformations at the code level, and that identifying such systematicness can help programmers better understand code changes and avoid inconsistent updates.
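
To make the idea of a change rule with explicit exceptions concrete, here is a minimal Python sketch. It is not the dissertation's inference algorithm: it only checks one hand-written rule against two versions, and the `RenameRule` class, the `evaluate` function, and all method names in the sample data are hypothetical, chosen purely for illustration. The rule reads roughly as "for every method in the given scope, the name in the new version is the old name with one token replaced".

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class RenameRule:
    """Hypothetical rule: for all methods matching `scope`,
    the new name is the old name with `old` replaced by `new`."""
    scope: str  # glob-style pattern over fully qualified method names
    old: str
    new: str

    def apply(self, name: str) -> str:
        return name.replace(self.old, self.new)


def evaluate(rule, old_version, new_version):
    """Split the methods in the rule's scope into matches and exceptions."""
    matches, exceptions = [], []
    for name in sorted(old_version):
        if not fnmatchcase(name, rule.scope):
            continue  # outside the rule's scope, so the rule says nothing
        if rule.apply(name) in new_version:
            matches.append(name)      # updated as the rule predicts
        else:
            exceptions.append(name)   # in scope, but not updated consistently
    return matches, exceptions


# Hypothetical sample data: method names in two program versions.
old_version = {
    "chart.Factory.createBarChart",
    "chart.Factory.createPieChart",
    "chart.Factory.createLineChart",
    "chart.Axis.getRange",
}
new_version = {
    "chart.Factory.makeBarChart",
    "chart.Factory.makePieChart",
    "chart.Factory.createLineChart",  # left behind by the rename
    "chart.Axis.getRange",
}

rule = RenameRule(scope="chart.Factory.create*", old="create", new="make")
matches, exceptions = evaluate(rule, old_version, new_version)
print("matches:", matches)
print("exceptions:", exceptions)
```

In this sketch, a single rule summarizes the two consistent renames, while the method that was not renamed surfaces as an exception, which is the kind of inconsistent update the abstract says such rules can help reveal. Inferring the rules automatically, as the dissertation proposes, is a separate problem that this example does not attempt.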