Vulnerability extrapolation: assisted discovery of vulnerabilities using machine learning

  • Authors:
  • Fabian Yamaguchi;Felix Lindner;Konrad Rieck

  • Affiliations:
  • Recurity Labs GmbH, Germany;Recurity Labs GmbH, Germany;Technische Universität Berlin, Germany

  • Venue:
  • WOOT'11 Proceedings of the 5th USENIX conference on Offensive technologies
  • Year:
  • 2011

Abstract

Rigorous identification of vulnerabilities in program code is a key to implementing and operating secure systems. Unfortunately, only some types of vulnerabilities can be detected automatically. While techniques from software testing can accelerate the search for security flaws, in the general case discovery of vulnerabilities is a tedious process that requires significant expertise and time. In this paper, we propose a method for assisted discovery of vulnerabilities in source code. Our method proceeds by embedding code in a vector space and automatically determining API usage patterns using machine learning. Starting from a known vulnerability, these patterns can be exploited to guide the auditing of code and to identify potentially vulnerable code with similar characteristics, a process we refer to as vulnerability extrapolation. We empirically demonstrate the capabilities of our method in different experiments. In a case study with the library FFmpeg, we are able to narrow the search for interesting code from 6,778 to 20 functions and discover two security flaws, one being a known flaw and the other constituting a zero-day vulnerability.
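
The pipeline outlined in the abstract (embed functions in a vector space, derive API usage patterns, then rank code by similarity to a known vulnerable function) can be illustrated with a minimal sketch. The abstract does not give the details, so everything concrete here is an assumption made for illustration: the regex-based extraction of called symbols, the smoothed TF-IDF weighting, the use of a truncated SVD to obtain usage patterns, cosine similarity for ranking, and the toy corpus with its function names. It is not the authors' implementation.

```python
import math
import re
from collections import Counter

import numpy as np

# Hypothetical toy corpus (not from the paper): function name -> source text.
FUNCTIONS = {
    "decode_a": "n = get_bits(gb, 8); buf = av_malloc(n); memcpy(buf, src, n);",
    "decode_b": "len = get_bits(gb, 16); out = av_malloc(len); memcpy(out, in, len);",
    "log_info": "av_log(ctx, level, msg); snprintf(tmp, size, fmt);",
    "log_warn": "av_log(ctx, level, text); snprintf(line, size, fmt);",
    "close_file": "fflush(fp); fclose(fp); av_free(path);",
}


def api_symbols(source):
    """Return called symbols: identifiers directly followed by '('."""
    return re.findall(r"\b([A-Za-z_]\w*)\s*\(", source)


def embed(functions):
    """Map each function to a TF-IDF weighted vector over API symbols."""
    names = sorted(functions)
    counts = [Counter(api_symbols(functions[name])) for name in names]
    vocab = sorted(set().union(*counts))
    n_docs = len(names)
    # Smoothed inverse document frequency (an assumed weighting scheme).
    idf = {
        sym: math.log((1 + n_docs) / (1 + sum(1 for c in counts if sym in c))) + 1
        for sym in vocab
    }
    matrix = np.array([[c[sym] * idf[sym] for sym in vocab] for c in counts])
    return names, matrix


def extrapolate(functions, seed, k=3, top=3):
    """Rank functions by similarity to `seed` in a k-dimensional pattern space."""
    names, matrix = embed(functions)
    # Truncated SVD: the first k right singular vectors serve as the
    # dominant API usage patterns; each function is projected onto them.
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    reduced = u[:, :k] * s[:k]
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    unit = reduced / np.where(norms == 0, 1.0, norms)
    sims = unit @ unit[names.index(seed)]  # cosine similarity to the seed
    ranked = sorted(
        ((float(sim), name) for sim, name in zip(sims, names) if name != seed),
        reverse=True,
    )
    return [name for _, name in ranked[:top]]


if __name__ == "__main__":
    # Starting from a function assumed to be vulnerable, list audit candidates.
    print(extrapolate(FUNCTIONS, "decode_a", top=2))
```

In this sketch the auditor supplies a seed function and inspects only the top-ranked candidates, which mirrors the idea of narrowing a large code base to a short list of functions with similar API usage.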