Have things changed now?: an empirical study of bug characteristics in modern open source software

  • Authors:
  • Zhenmin Li;Lin Tan;Xuanhui Wang;Shan Lu;Yuanyuan Zhou;Chengxiang Zhai

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL;University of Illinois at Urbana-Champaign, Urbana, IL

  • Venue:
  • Proceedings of the 1st workshop on Architectural and system support for improving software dependability
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Software errors are a major cause for system failures. To effectively design tools and support for detecting and recovering from software failures requires a deep understanding of bug characteristics. Recently, software and its development process have significantly changed in many ways, including more help from bug detection tools, shift towards multi-threading architecture, the open-source development paradigm and increasing concerns about security and user-friendly interface. Therefore, results from previous studies may not be applicable to present software. Furthermore, many new aspects such as security, concurrency and open-source-related characteristics have not well studied. Additionally, previous studies were based on a small number of bugs, which may lead to non-representative results.To investigate the impacts of the new factors on software errors, we analyze bug characteristics by first sampling hundreds of real world bugs in two large, representative open-source projects. To validate the representativeness of our results, we use natural language text classification techniques and automatically analyze around 29, 000 bugs from the Bugzilla databases of the software.Our study has discovered several new interesting characteristics: (1) memory-related bugs have decreased because quite a few effective detection tools became available recently; (2) surprisingly, some simple memory-related bugs such as NULL pointer dereferences that should have been detected by existing tools in development are still a major component, which indicates that the tools have not been used with their full capacity; (3) semantic bugs are the dominant root causes, as they are application specific and difficult to fix, which suggests that more efforts should be put into detecting and fixing them; (4) security bugs are increasing, and the majority of them cause severe impacts.