Malware detection using assembly and API call sequences

  • Authors:
  • Madhu K. Shankarapani;Subbu Ramamoorthy;Ram S. Movva;Srinivas Mukkamala

  • Affiliations:
  • Department of Computer Science, Institute for Complex Additive Systems Analysis, Computational Analysis and Network Enterprise Solutions, New Mexico Tech, Socorro, USA 87801;Cyber Security Works, New Mexico Tech, Socorro, USA 87801;Cyber Security Works, New Mexico Tech, Socorro, USA 87801;Department of Computer Science, Institute for Complex Additive Systems Analysis, Computational Analysis and Network Enterprise Solutions, New Mexico Tech, Socorro, USA 87801

  • Venue:
  • Journal in Computer Virology
  • Year:
  • 2011

Abstract

One of the major problems concerning information assurance is malicious code. To evade detection, malware is encrypted or obfuscated to produce variants that continue to plague properly defended and patched networks with zero-day exploits. With malware authors using obfuscation techniques to generate polymorphic and metamorphic versions automatically, anti-virus software must keep up with new samples and create signatures that recognize the new variants. Creating a signature for each variant in a timely fashion is a problem that anti-virus companies face all the time. In this paper we present detection algorithms that can help the anti-virus community ensure that a variant of a known malware can still be detected without creating a new signature; a similarity analysis (based on specific quantitative measures) is performed to produce a matrix of similarity scores that can be used to determine the likelihood that a piece of code under inspection contains a particular malware. The two general malware detection methods presented in this paper are Static Analyzer for Vicious Executables (SAVE) and Malware Examiner using Disassembled Code (MEDiC). MEDiC uses assembly calls for analysis, and SAVE uses API calls (static API call sequence and static API call set). We show where assembly can be superior to API calls in that it allows a more detailed comparison of executables. API calls, on the other hand, can be superior to assembly in speed and in signature size. Our two proposed techniques are implemented in SAVE and MEDiC. We present experimental results indicating that both proposed techniques can provide better detection performance against obfuscated malware. We also found a few false positives, such as programs that use network functions (e.g. PuTTY) and encrypted programs (no API calls or assembly functions are found in the source code), when the threshold is set at a 50% similarity measure. However, these false positives can be minimized, for example by raising the threshold that determines whether a program falls in the malicious category to 70%.
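
The similarity analysis and thresholding described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the quantitative measures used by SAVE or MEDiC, so the two measures below are stand-ins chosen for illustration: Jaccard similarity for the API call set and the `difflib.SequenceMatcher` ratio for the API call sequence. The signature name, the sample names, and the call lists are hypothetical, and returning 0.0 for a program with no extracted calls is a choice made for this sketch only.

```python
from difflib import SequenceMatcher

def set_similarity(calls_a, calls_b):
    """Jaccard similarity of two API call sets (order ignored). Stand-in measure."""
    a, b = set(calls_a), set(calls_b)
    if not a or not b:
        return 0.0  # sketch-only choice for programs with no extracted calls
    return len(a & b) / len(a | b)

def sequence_similarity(calls_a, calls_b):
    """Order-aware similarity of two API call sequences. Stand-in measure."""
    if not calls_a or not calls_b:
        return 0.0
    return SequenceMatcher(None, calls_a, calls_b, autojunk=False).ratio()

def similarity_matrix(samples, signatures, measure):
    """Score every sample against every known signature."""
    return {
        name: {sig: measure(calls, sig_calls) for sig, sig_calls in signatures.items()}
        for name, calls in samples.items()
    }

def flag_malicious(matrix, threshold):
    """A sample is flagged if its best score reaches the threshold."""
    return {name: max(row.values()) >= threshold for name, row in matrix.items()}

# Hypothetical data: one known signature, one obfuscated variant, one benign
# network tool that shares the networking calls.
signatures = {
    "hypothetical_worm": ["socket", "connect", "send",
                          "CreateFileA", "WriteFile", "CreateProcessA"],
}
samples = {
    "variant": ["socket", "connect", "send", "CreateFileA",
                "WriteFile", "RegSetValueExA", "CreateProcessA"],
    "network_tool": ["socket", "connect", "send", "recv", "closesocket"],
}

seq_scores = similarity_matrix(samples, signatures, sequence_similarity)
set_scores = similarity_matrix(samples, signatures, set_similarity)

flags_at_50 = flag_malicious(seq_scores, 0.50)
flags_at_70 = flag_malicious(seq_scores, 0.70)
print(seq_scores)
print(flags_at_50)
print(flags_at_70)
```

With this toy data the benign network tool scores about 0.55 on the sequence measure, so it is flagged at a 50% threshold but not at 70%, while the variant scores about 0.92 and is flagged at both. This mirrors the false-positive behavior the abstract reports for programs that use network functions, without claiming to reproduce the paper's actual measures or results.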