Unity in Diversity: Phylogenetic-inspired Techniques for Reverse Engineering and Detection of Malware Families

  • Authors:
  • Wei Ming Khoo;Pietro Lio

  • Affiliations:
  • -;-

  • Venue:
  • SYSSEC '11 Proceedings of the 2011 First SysSec Workshop
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

We developed a framework for abstracting, aligning and analysing malware execution traces and performed a preliminary exploration of state of the art phylogenetic methods, whose strengths lie in pattern recognition and visualisation, to derive the statistical relationships within two contemporary malware families. We made use of phylogenetic trees and networks, motifs, logos, composition biases, and tree topology comparison methods with the objective of identifying common functionality and studying sources of variation in related samples. Networks were more useful for visualising short nop-equivalent code metamorphism than trees, tree topology comparison was suited for studying variations in multiple sets of homologous procedures. We found logos could be used for code normalisation, which resulted in 33% to 62% reduction in the number of instructions. A motif search showed that API sequences related to the management of memory, I/O, libraries and threading do not change significantly amongst malware variants, composition bias provided an efficient way to distinguish between families. Using context-sensitive procedure analysis, we found that 100% of a set of memory management procedures used by the FakeAV-DO and "Skyhoo" malware families were uniquely identifiable. We discuss how phylogenetic techniques can aid the reverse engineering and detection of malware families and describe some related challenges.