Software error early detection system based on run-time statistical analysis of function return values

  • Authors: Alex Depoutovitch; Michael Stumm
  • Affiliations: Dept. of Computer Science and Dept. of Electrical and Computer Engineering, University of Toronto (both authors)
  • Venue: HotACI'06: Proceedings of the First International Conference on Hot Topics in Autonomic Computing
  • Year: 2006

Abstract

Large software systems are extremely complex and based on code that is constantly changing with bug fixes and new features. As a result, these systems will likely never be free of bugs. The bugs typically don't expose themselves until they are triggered by a new workload, and when triggered, they are rarely immediately fatal, but result in a system that continues to run with corrupt internal state, deteriorating over time to the point where it becomes inoperable. Having a method to identify corrupt state early would allow the initiation of defensive actions such as flushing page caches or redirecting external requests to another service in the cluster. In this paper, we propose a statistical method of detecting problems in software at run-time based on analyzing function return values. The methodology, at this time, requires the availability of source code, but does not require understanding the source code. Our experimental results indicate that our method can be effective in identifying problems early on, potentially allowing for defensive measures. The overhead is negligible at less than 1%.
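
The abstract does not say which statistic the authors compute, so the following is only a minimal sketch of the general idea of monitoring function return values at run time, not the paper's method. It assumes return values are bucketed into coarse classes (negative, zero, positive, and so on), that a baseline distribution is learned during a period of known-good operation, and that a function is flagged when the distribution over a recent window drifts from that baseline by more than a threshold. All names here (`ReturnValueMonitor`, `watch`, `deviation`, `read_block`) and the use of total variation distance are hypothetical choices made for illustration.

```python
import functools
from collections import Counter, defaultdict, deque


class ReturnValueMonitor:
    """Hypothetical monitor comparing recent return-value classes to a baseline."""

    def __init__(self, window=100, threshold=0.3):
        self.threshold = threshold
        self.training = True  # True while learning the baseline
        self.baseline = defaultdict(Counter)  # function name -> class counts
        self.recent = defaultdict(lambda: deque(maxlen=window))

    @staticmethod
    def classify(value):
        """Bucket a return value into a coarse class (an assumed scheme)."""
        if value is None:
            return "none"
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, (int, float)):
            if value < 0:
                return "negative"
            return "zero" if value == 0 else "positive"
        return "other"

    def record(self, name, value):
        cls = self.classify(value)
        if self.training:
            self.baseline[name][cls] += 1
        else:
            self.recent[name].append(cls)

    def watch(self, func):
        """Decorator that records every return value of func."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            self.record(func.__name__, result)
            return result
        return wrapper

    def deviation(self, name):
        """Total variation distance (0..1) between baseline and recent window."""
        base = self.baseline[name]
        recent = Counter(self.recent[name])
        n_base, n_recent = sum(base.values()), sum(recent.values())
        if n_base == 0 or n_recent == 0:
            return 0.0
        classes = set(base) | set(recent)
        return 0.5 * sum(
            abs(base[c] / n_base - recent[c] / n_recent) for c in classes
        )

    def suspicious(self):
        """Names of functions whose recent behaviour exceeds the threshold."""
        return sorted(
            name for name in self.recent if self.deviation(name) > self.threshold
        )


monitor = ReturnValueMonitor(window=50, threshold=0.3)


@monitor.watch
def read_block(fail):
    # Stand-in for an instrumented function: -1 on error, byte count otherwise.
    return -1 if fail else 512


# Baseline phase: 5% of calls fail.
for i in range(200):
    read_block(fail=(i % 20 == 0))

# Monitoring phase: 50% of calls fail, which should be flagged.
monitor.training = False
for i in range(50):
    read_block(fail=(i % 2 == 0))

print(monitor.suspicious())
```

In a deployment along the lines the abstract describes, a non-empty result from `suspicious()` would be the trigger for a defensive action such as flushing page caches or redirecting requests to another service in the cluster. The decorator stands in for source-level instrumentation, which is why such an approach needs access to the source code but no understanding of it.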