Semantics-Aware Malware Detection
SP '05 Proceedings of the 2005 IEEE Symposium on Security and Privacy
Novel Hybrid Hierarchical-K-means Clustering Method (H-K-means) for Microarray Analysis
CSBW '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference - Workshops
IMDS: intelligent malware detection system
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
BIBM '08 Proceedings of the 2008 IEEE International Conference on Bioinformatics and Biomedicine
Automated classification and analysis of internet malware
RAID'07 Proceedings of the 10th international conference on Recent advances in intrusion detection
Text categorization with class-based and corpus-based keyword selection
ISCIS'05 Proceedings of the 20th international conference on Computer and Information Sciences
Survey of clustering algorithms
IEEE Transactions on Neural Networks
Stock price movement prediction using representative prototypes of financial reports
ACM Transactions on Management Information Systems (TMIS)
Hi-index | 0.00 |
Nowadays, numerous attacks made by the malware, such as viruses, backdoors, spyware, trojans and worms, have presented a major security threat to computer users. The most significant line of defense against malware is anti-virus products which detects, removes, and characterizes these threats. The ability of these AV products to successfully characterize these threats greatly depends on the method for categorizing these profiles of malware into groups. Therefore, clustering malware into different families is one of the computer security topics that are of great interest. In this paper, resting on the analysis of the extracted instruction of malware samples, we propose a novel parameter-free hybrid clustering algorithm (PFHC) which combines the merits of hierarchical clustering and K-means algorithms for malware clustering. It can not only generate stable initial division, but also give the best K. PFHC first utilizes agglomerative hierarchical clustering algorithm as the frame, starting with N singleton clusters, each of which exactly includes one sample, then reuses the centroids of upper level in every level and merges the two nearest clusters, finally adopts K-means algorithm for iteration to achieve an approximate global optimal division. PFHC evaluates clustering validity of each iteration procedure and generates the best K by comparing the values. The promising studies on real daily data collection illustrate that, compared with popular existing K-means and hierarchical clustering approaches, our proposed PFHC algorithm always generates much higher quality clusters and it can be well used for malware categorization.