Matrix computations (3rd ed.)
Data Mining Methods for Detection of New Malicious Executables
SP '01 Proceedings of the 2001 IEEE Symposium on Security and Privacy
Static Analyzer of Vicious Executables (SAVE)
ACSAC '04 Proceedings of the 20th Annual Computer Security Applications Conference
Semantics-Aware Malware Detection
SP '05 Proceedings of the 2005 IEEE Symposium on Security and Privacy
Near-Optimal Hashing Algorithms for Approximate Nearest Neighbor in High Dimensions
FOCS '06 Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science
Learning to Detect and Classify Malicious Executables in the Wild
The Journal of Machine Learning Research
Exploring Multiple Execution Paths for Malware Analysis
SP '07 Proceedings of the 2007 IEEE Symposium on Security and Privacy
Behavior-based spyware detection
USENIX-SS'06 Proceedings of the 15th conference on USENIX Security Symposium - Volume 15
Statistical signatures for fast filtering of instruction-substituting metamorphic malware
Proceedings of the 2007 ACM workshop on Recurring malcode
Introduction to Information Retrieval
Introduction to Information Retrieval
ACSAC '08 Proceedings of the 2008 Annual Computer Security Applications Conference
IMAD: in-execution malware analysis and detection
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Large-scale malware indexing using function-call graphs
Proceedings of the 16th ACM conference on Computer and communications security
Synthesizing Near-Optimal Malware Specifications from Suspicious Behaviors
SP '10 Proceedings of the 2010 IEEE Symposium on Security and Privacy
Combining file content and file relations for cloud based malware detection
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Malware classification based on call graph clustering
Journal in Computer Virology
New malicious code detection using variable length n-grams
ICISS'06 Proceedings of the Second international conference on Information Systems Security
Hi-index | 0.00 |
Typical malware classification methods analyze unknown files in isolation. However, this ignores valuable relationships between malware files, such as containment in a zip archive, dropping, or downloading. We present a new malware classification system based on a graph induced by file relationships, and, as a proof of concept, analyze containment relationships, for which we have much available data. However our methodology is general, relying only on an initial estimate for some of the files in our data and on propagating information along the edges of the graph. It can thus be applied to other types of file relationships. We show that since malicious files are often included in multiple malware containers, the system's detection accuracy can be significantly improved, particularly at low false positive rates which are the main operating points for automated malware classifiers. For example at a false positive rate of 0.2%, the false negative rate decreases from 42.1% to 15.2%. Finally, the new system is highly scalable; our basic implementation can learn good classifiers from a large, bipartite graph including over 719 thousand containers and 3.4 million files in a total of 16 minutes.