Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
Constructing Suffix Trees On-Line in Linear Time
Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture - Information Processing '92, Volume 1 - Volume I
Defining clusters from a hierarchical cluster tree
Bioinformatics
Automatic malware categorization using cluster ensemble
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
On challenges in evaluating malware clustering
RAID'10 Proceedings of the 13th international conference on Recent advances in intrusion detection
Automatic analysis of malware behavior using machine learning
Journal of Computer Security
BitShred: feature hashing malware for scalable triage and semantic analysis
Proceedings of the 18th ACM conference on Computer and communications security
A survey on automated dynamic malware-analysis techniques and tools
ACM Computing Surveys (CSUR)
Hi-index | 0.00 |
Over the past years, we have experienced an increase in the quantity and complexity of malware binaries. This change has been fueled by the introduction of malware generation tools and reuse of different malcode modules. Recent malware appears to be highly modular and less functionally typified. A side-effect of this "composition" of components across different malware types, a growing number of new malware samples cannot be explicitly assigned to traditional classes defined by Anti-Virus (AV) vendors. Indeed, by nature, clustering techniques capture dominant behavior that could be a manifestation of only one of the malware component failing to reveal malware similarities that depend on other, less dominant components and other evolutionary traits. In this paper, we introduce a novel malware behavioral commonality analysis scheme that takes into consideration component-wise grouping, called behavioral mapping. Our effort attempts to shed light to malware behavioral relationships and go beyond simply clustering the malware into a family. To this end, we implemented a method for identifying soft clusters and reveal shared malware components and traits. Using our method, we demonstrate that a malware sample can belong to several groups (clusters), implying sharing of its respective components with other samples from the groups. We performed experiments with a large corpus of real-world malware data-sets and identified that we can successfully highlight malware component relationships across the existing AV malware families and variants.