Each day, anti-virus companies receive tens of thousands of samples of potentially harmful executables. Many of the malicious samples are variations of previously encountered malware, created by their authors to evade pattern-based detection. Handling this volume of data requires robust, automatic detection approaches. This paper studies malware classification based on call graph clustering. Representing malware samples as call graphs abstracts certain variations away, which enables the detection of structural similarities between samples. Clustering similar samples together makes more generic detection techniques possible, because detection can then target the commonalities of the samples within a cluster. To compare call graphs with one another, we compute pairwise graph similarity scores via graph matchings that approximately minimize the graph edit distance. Next, to facilitate the discovery of similar malware samples, we employ several clustering algorithms, including k-medoids and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Clustering experiments are conducted on a collection of real malware samples, and the results are evaluated against manual classifications provided by human malware analysts. The experiments show that malware families can be detected accurately via call graph clustering. We anticipate that in the future, call graphs can be used to analyse the emergence of new malware families, and ultimately to automate the implementation of generic detection schemes.
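The pipeline summarised above (pairwise call graph dissimilarity, then density-based clustering) can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on loud assumptions. Call graphs are plain dictionaries mapping a function name to the set of functions it calls. Functions are matched simply by identical name, a stand-in for the approximate minimum-cost graph matching the abstract refers to. The names `vertices_and_edges`, `dissimilarity`, `dbscan` and the toy samples are all hypothetical. The edit cost counts vertices and edges present in only one of the two graphs, normalised by the total size of both graphs.

```python
def vertices_and_edges(graph):
    """Return the vertex set and directed edge set of a call graph."""
    edges = {(caller, callee) for caller, callees in graph.items() for callee in callees}
    vertices = set(graph) | {callee for _, callee in edges}
    return vertices, edges


def dissimilarity(g1, g2):
    """Normalised edit cost in [0, 1] under a name-based matching (assumption).

    A real system would search for a matching that approximately minimises
    the graph edit distance instead of relying on function names.
    """
    v1, e1 = vertices_and_edges(g1)
    v2, e2 = vertices_and_edges(g2)
    size = len(v1) + len(v2) + len(e1) + len(e2)
    if size == 0:
        return 0.0
    # Unmatched vertices and edges must be inserted or deleted.
    return (len(v1 ^ v2) + len(e1 ^ e2)) / size


def dbscan(dist, eps, min_pts):
    """Plain DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = [k for k in range(n) if dist[j][k] <= eps]
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Toy samples: two variants of each of two invented families, plus an outlier.
a1 = {"start": {"decrypt", "infect"}, "decrypt": set(),
      "infect": {"write_file"}, "write_file": set()}
a2 = {"start": {"decrypt", "infect", "junk"}, "decrypt": set(),
      "infect": {"write_file"}, "write_file": set(), "junk": set()}
b1 = {"main": {"connect", "log_keys"}, "connect": set(),
      "log_keys": {"send"}, "send": set()}
b2 = {"main": {"connect", "log_keys"}, "connect": set(),
      "log_keys": {"send", "sleep"}, "send": set(), "sleep": set()}
c1 = {"entry": {"exit"}}

samples = [a1, a2, b1, b2, c1]
dist = [[dissimilarity(g, h) for h in samples] for g in samples]
labels = dbscan(dist, eps=0.3, min_pts=2)
print(labels)  # [0, 0, 1, 1, -1]
```

In this toy data, the variant with an added junk function stays close to its original (dissimilarity 0.125), so each pair forms a cluster, while the unrelated sample is labelled as noise. The values `eps=0.3` and `min_pts=2` are chosen only for this example. Name-based matching would fail on real samples, where function names are usually unavailable or obfuscated, which is why an approximate graph matching is needed.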