Refactoring meets spreadsheet formulas

  • Authors: Sandro Badame, Danny Dig

  • Affiliations: University of Illinois (both authors)

  • Venue: ICSM '12: Proceedings of the 2012 IEEE International Conference on Software Maintenance (ICSM)

  • Year: 2012

Abstract

The number of end-users who write spreadsheet programs is at least an order of magnitude larger than the number of trained programmers who write professional software. We studied a corpus of 3691 spreadsheets and found that their formulas are riddled with the same smells that plague professional software: hardcoded constants, duplicated expressions, unnecessary complexity, and unsanitized input. These smells make spreadsheets difficult to read and expensive to maintain. Like automated refactoring in the object-oriented domain, spreadsheet refactoring can be transformative. In this paper we present seven refactorings for spreadsheet formulas, implemented in RefBook, a plugin for Microsoft Excel. To evaluate the usefulness of RefBook, we employed three kinds of empirical methods. First, we conducted a User Survey with 28 Excel users to find out whether they preferred the refactored formulas. Second, we conducted a Controlled Experiment with the same 28 participants to measure their productivity when refactoring manually. Third, we performed a Retrospective Case Study on the EUSES Spreadsheet Corpus of 3691 spreadsheets to determine how often the refactorings supported by RefBook could be applied. The results show that (i) users prefer the improved quality of refactored formulas, (ii) RefBook is faster and more reliable than manual refactoring, and (iii) the refactorings are widely applicable. On average, RefBook applies the refactorings in less than half the time users needed to perform them manually. When refactoring manually, 92.54% of users introduced errors or new smells into the spreadsheet or were unable to complete the task.
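To make the "hardcoded constants" smell concrete, the following Python sketch flags numeric literals inside a formula string and rewrites them as a reference to a dedicated cell. It is an illustration only, not RefBook's implementation: the function names `find_hardcoded_constants` and `extract_constant`, the regular expression, and the target cell `$F$1` are all assumptions made for this example, and the pattern deliberately ignores harder cases such as digits inside quoted strings or scientific notation.

```python
import re

# Hypothetical pattern: a numeric literal that is not part of a cell
# reference, range, or name (so the 2 in B2, $B$2, or LOG10 is skipped).
# Known gaps: digits inside quoted strings, scientific notation like 1e5.
_LITERAL = re.compile(r"(?<![A-Za-z0-9_.$:])\d+(?:\.\d+)?(?![A-Za-z0-9_.(:])")


def find_hardcoded_constants(formula: str) -> list[str]:
    """Return every numeric literal found in the formula, in order."""
    return _LITERAL.findall(formula)


def extract_constant(formula: str, literal: str, target_cell: str) -> str:
    """Replace each occurrence of `literal` with a reference to `target_cell`.

    The caller is assumed to have stored the literal's value in that cell.
    """
    def swap(match: re.Match) -> str:
        text = match.group(0)
        return target_cell if text == literal else text

    return _LITERAL.sub(swap, formula)


if __name__ == "__main__":
    smelly = "=A1*0.0825+A2*0.0825"
    print(find_hardcoded_constants(smelly))
    print(extract_constant(smelly, "0.0825", "$F$1"))
```

In this sketch the duplicated rate `0.0825` ends up in a single cell, so a later change to the rate touches one place instead of every formula that used it. A real tool would work on a parsed formula rather than on raw text.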