First-aid: surviving and preventing memory management bugs during production runs

  • Authors:
  • Qi Gao;Wenbin Zhang;Yan Tang;Feng Qin

  • Affiliations:
  • Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA;Ohio State University, Columbus, OH, USA

  • Venue:
  • Proceedings of the 4th ACM European conference on Computer systems
  • Year:
  • 2009

Abstract

Memory bugs in C/C++ programs severely affect system availability and security. This paper presents First-Aid, a lightweight runtime system that survives software failures caused by common memory management bugs and prevents future failures by the same bugs during production runs. Upon a failure, First-Aid diagnoses the bug type and identifies the memory objects that trigger the bug. To do so, it rolls back the program to previous checkpoints and uses two types of environmental changes that can prevent or expose memory bug manifestation during re-execution. Based on the diagnosis, First-Aid generates and applies runtime patches to avoid the memory bug and prevent its reoccurrence. Furthermore, First-Aid validates the consistent effects of the runtime patches and generates on-site diagnostic reports to assist developers in fixing the bugs. We have implemented First-Aid on Linux and evaluated it with seven applications that contain various types of memory bugs, including buffer overflow, uninitialized read, dangling pointer read/write, and double free. The results show that First-Aid can quickly diagnose the tested bugs and recover applications from failures (in 0.084 to 3.978 seconds). The results also show that the runtime patches generated by First-Aid can prevent future failures caused by the diagnosed bugs. Additionally, First-Aid provides detailed diagnostic information on both the root cause and the manifestation of the bugs. Furthermore, First-Aid incurs low overhead (0.4-11.6% with an average of 3.7%) during normal execution for the tested buggy applications, SPEC INT2000, and four allocation intensive programs.
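
The idea of a preventive environmental change applied at the allocator level can be illustrated with a minimal C sketch. The abstract does not say how First-Aid implements its changes or patches, so everything here is an assumption made for illustration: the names patched_malloc and patched_free, the constants PAD_BYTES and QUARANTINE, and the choice of padding, zero-filling, and delayed frees as the preventive measures. It is not First-Aid's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names and constants; not taken from First-Aid. */
#define PAD_BYTES   64   /* assumed padding appended to every object */
#define QUARANTINE  8    /* assumed number of frees that are delayed */

static void *delayed[QUARANTINE];   /* ring buffer of delayed frees */
static size_t delayed_next = 0;     /* next slot to recycle */

/* Assumed preventive change for buffer overflow and uninitialized read:
 * over-allocate by PAD_BYTES and zero-fill the whole region. */
void *patched_malloc(size_t size)
{
    if (size > (size_t)-1 - PAD_BYTES)
        return NULL;                 /* size + padding would wrap around */
    return calloc(1, size + PAD_BYTES);
}

/* Assumed preventive change for dangling pointers and double free:
 * defer the real free until QUARANTINE later frees have been issued. */
void patched_free(void *ptr)
{
    if (ptr == NULL)
        return;
    for (size_t i = 0; i < QUARANTINE; i++) {
        if (delayed[i] == ptr)
            return;                  /* already queued: ignore the double free */
    }
    assert(delayed_next < QUARANTINE);
    free(delayed[delayed_next]);     /* release the oldest entry; free(NULL) is a no-op */
    delayed[delayed_next] = ptr;
    delayed_next = (delayed_next + 1) % QUARANTINE;
}
```

In this sketch a small overflow lands in the padding instead of a neighboring object, a read of a fresh object sees zeros, and a recently freed object stays valid until it leaves the quarantine. A real system would apply such changes selectively to the objects identified during diagnosis, since padding and delaying every allocation costs memory.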