A quantitative study of accuracy in system call-based malware detection

Authors:
Davide Canali; Andrea Lanzi; Davide Balzarotti; Christopher Kruegel; Mihai Christodorescu; Engin Kirda
Affiliations:
EURECOM, France; EURECOM, France; EURECOM, France; UC Santa Barbara, USA; IBM Research, USA; Northeastern University, USA
Venue:
Proceedings of the 2012 International Symposium on Software Testing and Analysis
Year:
2012

Abstract

Over the last decade, there has been a significant increase in the number and sophistication of malware-related attacks and infections. Many detection techniques have been proposed to mitigate the malware threat. A running theme among existing detection techniques is that they make similar promises of high detection rates, in spite of the wildly different models (or specification classes) of malicious activity they use. In addition, the lack of a common testing methodology and the limited datasets used in the experiments make it difficult to compare these models and determine which ones yield the best detection accuracy. In this paper, we present a systematic approach to measuring how the choice of behavioral model influences the quality of a malware detector. We tackle this problem by executing a large number of testing experiments, in which we explore the parameter space of over 200 different models, corresponding to more than 220 million signatures. Our results suggest that commonly held beliefs about simple models are incorrect in how they relate changes in complexity to changes in detection accuracy. This implies that accuracy is non-linear across the model space and that analytical reasoning alone is insufficient for finding an optimal model; it has to be supplemented by testing and empirical measurements.