Malware analysis with tree automata inference

  • Authors:
  • Domagoj Babić;Daniel Reynaud;Dawn Song

  • Affiliations:
  • University of California, Berkeley;University of California, Berkeley;University of California, Berkeley

  • Venue:
  • CAV'11 Proceedings of the 23rd international conference on Computer aided verification
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

The underground malware-based economy is flourishing and it is evident that the classical ad-hoc signature detection methods are becoming insufficient. Malware authors seem to share some source code and malware samples often feature similar behaviors, but such commonalities are difficult to detect with signature-based methods because of an increasing use of numerous freelyavailable randomized obfuscation tools. To address this problem, the security community is actively researching behavioral detection methods that commonly attempt to understand and differentiate how malware behaves, as opposed to just detecting syntactic patterns. We continue that line of research in this paper and explore how formal methods and tools of the verification trade could be used for malware detection and analysis. We propose a new approach to learning and generalizing from observed malware behaviors based on tree automata inference. In particular, we develop an algorithm for inferring k-testable tree automata from system call dataflow dependency graphs and discuss the use of inferred automata in malware recognition and classification.