ARCHER: using symbolic, path-sensitive analysis to detect memory access errors

  • Authors:
  • Yichen Xie;Andy Chou;Dawson Engler

  • Affiliations:
  • Stanford University, Stanford, CA;Stanford University, Stanford, CA;Stanford University, Stanford, CA

  • Venue:
  • Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering
  • Year:
  • 2003

Abstract

Memory corruption errors lead to non-deterministic, elusive crashes. This paper describes ARCHER (ARray CHeckER), a static, effective memory access checker. ARCHER uses path-sensitive, interprocedural symbolic analysis to bound the values of both variables and memory sizes. It evaluates known values using a constraint solver at every array access, pointer dereference, or call to a function that expects a size parameter. Accesses that violate constraints are flagged as errors. Those that are exploitable by malicious attackers are marked as security holes.

We carefully designed ARCHER to work well on large bodies of source code. It requires no annotations to use (though it can use them). Its solver has been built to be powerful in the ways that real code requires, while backing off in the places that are irrelevant. This selective power allows it to gain efficiency while avoiding classes of false positives that arise when a complex analysis interacts badly with statically undecidable program properties. ARCHER uses statistical code analysis to automatically infer the set of functions that it should track. This inference serves as a robust guard against omissions, especially in large systems, which can have hundreds of such functions.

In practice ARCHER is effective: it finds many errors, its analysis scales to systems of millions of lines of code, and the average false positive rate of our results is below 35%. We have run ARCHER over several large open source software projects (such as Linux, OpenBSD, Sendmail, and PostgreSQL) and have found errors in all of them (118 in the case of Linux, including 21 security holes).
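
The idea of bounding a variable along one path and checking it at an array access can be illustrated with a small sketch. This is not ARCHER's solver: it substitutes a plain integer interval for the symbolic constraints the abstract describes, and the names `Interval`, `refine_ge`, `refine_lt`, and `check_access` are hypothetical, introduced only for this illustration. It models the C fragment `int buf[10]; if (i >= 0 && i <= 10) buf[i] = 0;`, where the off-by-one upper bound lets `i` reach 10.

```python
from dataclasses import dataclass

# Assumption: a 32-bit signed int, used only as the "unknown" starting range.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


@dataclass(frozen=True)
class Interval:
    """Inclusive range of values a variable may hold on the current path."""
    lo: int
    hi: int

    def refine_ge(self, bound: int) -> "Interval":
        # Taking the true branch of `x >= bound` raises the lower limit.
        return Interval(max(self.lo, bound), self.hi)

    def refine_lt(self, bound: int) -> "Interval":
        # Taking the true branch of `x < bound` lowers the upper limit.
        return Interval(self.lo, min(self.hi, bound - 1))

    def is_empty(self) -> bool:
        # An empty interval means the path is infeasible.
        return self.lo > self.hi


def check_access(index: Interval, size: int) -> str:
    """Classify the access buf[index] for a buffer of `size` elements."""
    if index.is_empty():
        return "unreachable"
    if index.lo >= 0 and index.hi < size:
        return "safe"            # every feasible index is in bounds
    if index.hi < 0 or index.lo >= size:
        return "error"           # every feasible index is out of bounds
    return "possible error"      # some feasible index is out of bounds


unknown = Interval(INT_MIN, INT_MAX)

# Path condition `i >= 0 && i <= 10` (written as i < 11), then buf[i] with size 10.
buggy_path = unknown.refine_ge(0).refine_lt(11)
result_buggy = check_access(buggy_path, 10)

# Corrected condition `i >= 0 && i < 10`.
fixed_path = unknown.refine_ge(0).refine_lt(10)
result_fixed = check_access(fixed_path, 10)
```

Because each branch narrows the interval only along the path that takes it, the same access can be classified differently on different paths, which is the sense in which such a check is path-sensitive. A real checker would also have to track buffer sizes symbolically and follow values across function calls, which this sketch omits.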