Malware analysis has traditionally been only a semi-automated process, often requiring a skilled human analyst. As new malware appears at an increasingly alarming rate (now over 100,000 new variants each day), there is a need for automated techniques that identify suspicious behavior in programs. In this paper, we propose a method for extracting statistically significant malicious behaviors from a system call dependency graph, which is obtained by running a binary executable in a sandbox. Our approach is based on a new method for measuring the statistical significance of subgraphs. Given a training set of graphs from two classes (e.g., system call dependency graphs of goodware and of malware), our method can assign p-values to subgraphs of new graph instances even if those subgraphs have not appeared in the training data, and can thus potentially capture new behaviors or disguised versions of existing behaviors.
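
To make the notion of a subgraph p-value concrete, the following is a minimal sketch that is not the method proposed in the paper. It assumes a much simpler stand-in: a one-sided binomial tail test that asks how surprising a subgraph's frequency among malware graphs is, given its (smoothed) frequency among goodware graphs. The function names and the Laplace smoothing are assumptions made for illustration, and unlike the proposed method, this sketch needs occurrence counts for the subgraph from the training data.

```python
from math import comb


def binomial_tail_p_value(k: int, n: int, p0: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def subgraph_p_value(malware_hits: int, malware_total: int,
                     goodware_hits: int, goodware_total: int) -> float:
    """Hypothetical helper: p-value of a subgraph's support in malware graphs.

    The background rate p0 is estimated from goodware graphs with Laplace
    smoothing (an assumption of this sketch), so a subgraph never seen in
    goodware does not produce p0 = 0.
    """
    p0 = (goodware_hits + 1) / (goodware_total + 2)
    return binomial_tail_p_value(malware_hits, malware_total, p0)


if __name__ == "__main__":
    # A subgraph found in 9 of 10 malware graphs but only 1 of 10 goodware graphs.
    print(subgraph_p_value(9, 10, 1, 10))
    # A subgraph found in 2 of 10 malware graphs and 1 of 10 goodware graphs.
    print(subgraph_p_value(2, 10, 1, 10))
```

Under these assumptions, a small p-value marks a subgraph whose frequency in malware is unlikely to arise from the goodware background rate alone; a subgraph with similar frequency in both classes receives a large p-value.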