Software Visualization in the Large

  • Authors: Thomas Ball; Stephen G. Eick

  • Venue: Computer
  • Year: 1996

Quantified Score

Hi-index 4.10

Abstract

Production-sized systems, particularly legacy software, can contain millions of lines of code. Even a seemingly simple, small-team project, such as a spreadsheet, is quite complicated. Understanding, changing, and repairing code in large systems is especially time-consuming and costly. Knowledge of code decays as the software ages and the original programmers and design team move on to new assignments. The design documents are also usually out of date, leaving the code as the only guide to system behavior. It is tedious to reconstruct complex system behavior by analyzing code.

Perhaps the most difficult software engineering projects involve "programming in the large." These large-team projects, often in maintenance mode, require enhancements involving subtle changes to complex legacy code written over many years. Under these circumstances, programmer productivity is low, changes are more likely to introduce errors, and software projects are often late.

Software visualization can help software engineers cope with this complexity while increasing programmer productivity. Software is intangible, having no physical shape or size. After it is written, code "disappears" into files kept on disks. Software visualization tools use graphical techniques to make software visible by displaying programs, program artifacts, and program behavior. Pictures of the software can help slow knowledge decay by helping project members remember--and new members discover--how the code works. Three basic properties of software can be visualized: software structure (as in directed graphs); runtime behavior (as in algorithm animation); and the code itself (as in pretty printers).

Previous approaches to software visualization, although useful for small projects, do not scale to the production-sized systems currently being manufactured. The graphical techniques found in programming, program-visualization, and algorithm-animation environments target small systems. Algorithm visualizations are usually hand-crafted and require the designer to understand the code before visualizing it, making this technique infeasible for large systems or tasks involving programmer discovery. The general strategy for large projects is to decompose the project into modules, usually hierarchically, and display each module individually. In practice, this decomposition is often the most difficult aspect of the visualization. When software is decomposed, the "big picture" is lost, often defeating the purpose of the visualization.

To address these shortcomings, the authors developed scalable techniques for visualizing program text, text properties, and relationships involving program text, as text is the dominant medium for implementing large software systems. They have applied their tools to visualize code version history, differences between releases, static properties of code, code profiling and execution hot spots, and dynamic program slices. The systems presented are used daily within Bell Laboratories' development community, helping software developers work on the 5ESS product, a real-time switching system containing millions of lines of code developed over the past two decades by thousands of software engineers. The initial developer feedback has been very positive.
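
The abstract does not spell out how program text and its properties are drawn, so the following is only an illustrative sketch of one way a text property could be visualized at scale: each source line is reduced to a thin row that keeps its indentation and length, and is shaded by a per-line statistic (for example age, or execution count from a profile). The names `Row`, `line_rows`, and `render`, the five-level shading, and the character palette are all assumptions made for this example and are not taken from the authors' tools.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Row:
    """Reduced representation of one source line (hypothetical type)."""
    indent: int  # leading whitespace, in columns
    length: int  # visible length of the line after the indent, in columns
    shade: int   # bucket from 0 (lowest statistic) to levels - 1 (highest)


def line_rows(lines: Sequence[str], stats: Sequence[float],
              levels: int = 5) -> List[Row]:
    """Map each line and its statistic to a Row with a shade bucket."""
    if len(lines) != len(stats):
        raise ValueError("need exactly one statistic per line")
    lo = min(stats, default=0.0)
    hi = max(stats, default=0.0)
    span = hi - lo
    rows = []
    for text, value in zip(lines, stats):
        expanded = text.expandtabs(8).rstrip()
        stripped = expanded.lstrip()
        indent = len(expanded) - len(stripped)
        if span == 0:
            shade = 0  # all statistics equal: use a single bucket
        else:
            # Scale into [0, levels) and clamp the maximum into the top bucket.
            shade = min(levels - 1, int((value - lo) / span * levels))
        rows.append(Row(indent, len(stripped), shade))
    return rows


def render(rows: Sequence[Row], palette: str = ".:+*#") -> str:
    """Draw each Row as a run of one palette character (text stand-in for color)."""
    return "\n".join(" " * r.indent + palette[r.shade] * r.length for r in rows)


if __name__ == "__main__":
    source = ["def f(x):", "    return x + 1", ""]
    hits = [0, 10, 5]  # made-up per-line execution counts
    print(render(line_rows(source, hits)))
```

In a sketch like this, the shape of the code (indentation and line length) survives the reduction, so a whole file fits in a narrow column while the shading shows where the statistic concentrates. A graphical version would presumably replace the character palette with a color scale.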