Discriminant malware distance learning on structural information for automated malware classification

  • Authors:
  • Deguang Kong;Guanhua Yan

  • Affiliations:
  • University of Texas at Arlington, Arlington, TX, USA;Los Alamos National Lab, los alamos, NM, USA

  • Venue:
  • Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

The voluminous malware variants that appear in the Internet have posed severe threats to its security. In this work, we explore techniques that can automatically classify malware variants into their corresponding families. We present a generic framework that extracts structural information from malware programs as attributed function call graphs, in which rich malware features are encoded as attributes at the function level. Our framework further learns discriminant malware distance metrics that evaluate the similarity between the attributed function call graphs of two malware programs. To combine various types of malware attributes, our method adaptively learns the confidence level associated with the classification capability of each attribute type and then adopts an ensemble of classifiers for automated malware classification. We evaluate our approach with a number of Windows-based malware instances belonging to 11 families, and experimental results show that our automated malware classification method is able to achieve high classification accuracy.