Combining file content and file relations for cloud based malware detection

  • Authors:
  • Yanfang Ye;Tao Li;Shenghuo Zhu;Weiwei Zhuang;Egemen Tas;Umesh Gupta;Melih Abdulhayoglu

  • Affiliations:
  • Comodo Security Solutions, Inc, Beijing, China;Florida International University, Miami, FL, USA;NEC Laboratories America, Cupertino, CA, USA;Xiamen University, Xiamen, China;Comodo Security Solutions, Inc, New Jersey, NJ, USA;Comodo Security Solutions, Inc, New Jersey, NJ, USA;Comodo Security Solutions, Inc, New Jersey, NJ, USA

  • Venue:
  • Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2011

Abstract

Because of the damage it does to Internet security, malware (such as viruses, worms, trojans, spyware, backdoors, and rootkits) has held the attention of both the anti-malware industry and researchers for decades. Data mining methods such as Naive Bayes and Support Vector Machines have been used for malware detection, based on the analysis of file content extracted from file samples, such as Application Programming Interface (API) calls, instruction sequences, and binary strings. However, besides file content, relations among file samples (for example, a "Downloader" is always associated with many Trojans) can provide invaluable information about the properties of file samples. In this paper, we study how file relations can be used to improve malware detection results and develop a file verdict system (named "Valkyrie") that builds on a semi-parametric classifier model to combine file content and file relations for malware detection. To the best of our knowledge, this is the first work to use both file content and file relations for malware detection. A comprehensive experimental study is performed on a large collection of PE files obtained from the clients of anti-malware products of Comodo Security Solutions, Inc. to compare various malware detection approaches. Promising experimental results demonstrate that our Valkyrie system outperforms other popular anti-malware software tools, such as Kaspersky AntiVirus and McAfee VirusScan, as well as alternative data mining based detection systems, in both accuracy and efficiency.
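
The abstract does not specify the semi-parametric model, so the following is only a minimal sketch of the general idea of letting file relations adjust a content-based verdict. It is not the Valkyrie classifier. It assumes each file already has a content-based malware score in [0, 1] from some classifier, and it repeatedly blends that score with the average score of related files. The function name `combine_scores`, the weight `alpha`, and the sample file names are all hypothetical.

```python
def combine_scores(content_scores, relations, alpha=0.5, iterations=20):
    """Blend each file's content score with the scores of related files.

    content_scores: dict mapping file id -> content-based malware score in [0, 1]
    relations: iterable of (file_a, file_b) pairs, treated as undirected
    alpha: weight kept on the file's own content score (assumed parameter)

    Assumes every file named in `relations` also appears in `content_scores`.
    """
    neighbors = {f: set() for f in content_scores}
    for a, b in relations:
        neighbors[a].add(b)
        neighbors[b].add(a)

    scores = dict(content_scores)
    for _ in range(iterations):
        updated = {}
        for f, own in content_scores.items():
            if neighbors[f]:
                # Average the current scores of all related files.
                avg = sum(scores[n] for n in neighbors[f]) / len(neighbors[f])
                updated[f] = alpha * own + (1 - alpha) * avg
            else:
                # Files with no relations keep their content score.
                updated[f] = own
        scores = updated
    return scores


# Hypothetical example: an ambiguous downloader related to two likely trojans.
example_scores = {"downloader": 0.5, "trojan_a": 0.9, "trojan_b": 0.95, "benign": 0.1}
example_relations = [("downloader", "trojan_a"), ("downloader", "trojan_b")]
combined = combine_scores(example_scores, example_relations)
```

In this toy setup the downloader's score rises above its content-only value because both related files look malicious, while the unrelated file is left unchanged. Each update is a convex combination, so scores stay within [0, 1].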