Silhouettes: a graphical aid to the interpretation and validation of cluster analysis
Journal of Computational and Applied Mathematics
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
OPTICS: ordering points to identify the clustering structure
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Service specific anomaly detection for network intrusion detection
Proceedings of the 2002 ACM symposium on Applied computing
On Clustering Validation Techniques
Journal of Intelligent Information Systems
Polygraph: Automatically Generating Signatures for Polymorphic Worms
SP '05 Proceedings of the 2005 IEEE Symposium on Security and Privacy
MisleadingWorm Signature Generators Using Deliberate Noise Injection
SP '06 Proceedings of the 2006 IEEE Symposium on Security and Privacy
Exploring Multiple Execution Paths for Malware Analysis
SP '07 Proceedings of the 2007 IEEE Symposium on Security and Privacy
Panorama: capturing system-wide information flow for malware detection and analysis
Proceedings of the 14th ACM conference on Computer and communications security
Mining specifications of malicious behavior
ISEC '08 Proceedings of the 1st India software engineering conference
Hash-AV: fast virus signature scanning by cache-resident filters
International Journal of Security and Networks
Learning and Classification of Malware Behavior
DIMVA '08 Proceedings of the 5th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
A Study of the Packer Problem and Its Solutions
RAID '08 Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection
CloudAV: N-version antivirus in the network cloud
SS'08 Proceedings of the 17th conference on Security symposium
McPAD: A multiple classifier system for accurate payload-based anomaly detection
Computer Networks: The International Journal of Computer and Telecommunications Networking
Studying spamming botnets using Botlab
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
Automated classification and analysis of internet malware
RAID'07 Proceedings of the 10th international conference on Recent advances in intrusion detection
Behavioral clustering of HTTP-based malware and signature generation using malicious network traces
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Effective and efficient malware detection at the end host
SSYM'09 Proceedings of the 18th conference on USENIX security symposium
Validity index for clusters of different sizes and densities
Pattern Recognition Letters
JACKSTRAWS: picking command and control connections from bot traffic
SEC'11 Proceedings of the 20th USENIX conference on Security
The power of procrastination: detection and mitigation of execution-stalling malicious code
Proceedings of the 18th ACM conference on Computer and communications security
BitShred: feature hashing malware for scalable triage and semantic analysis
Proceedings of the 18th ACM conference on Computer and communications security
Paragraph: thwarting signature learning by training maliciously
RAID'06 Proceedings of the 9th international conference on Recent Advances in Intrusion Detection
Anagram: a content anomaly detector resistant to mimicry attack
RAID'06 Proceedings of the 9th international conference on Recent Advances in Intrusion Detection
Some new indexes of cluster validity
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Computer Networks: The International Journal of Computer and Telecommunications Networking
Is data clustering in adversarial settings secure?
Proceedings of the 2013 ACM workshop on Artificial intelligence and security
ExecScent: mining for new C&C domains in live networks with adaptive control protocol templates
SEC'13 Proceedings of the 22nd USENIX conference on Security
Hi-index | 0.00 |
A large number of today's botnets leverage the HTTP protocol to communicate with their botmasters or perpetrate malicious activities. In this paper, we present a new scalable system for network-level behavioral clustering of HTTP-based malware that aims to efficiently group newly collected malware samples into malware family clusters. The end goal is to obtain malware clusters that can aid the automatic generation of high quality network signatures, which can in turn be used to detect botnet command-and-control (C&C) and other malware-generated communications at the network perimeter. We achieve scalability in our clustering system by simplifying the multi-step clustering process proposed in [31], and by leveraging incremental clustering algorithms that run efficiently on very large datasets. At the same time, we show that scalability is achieved while retaining a good trade-off between detection rate and false positives for the signatures derived from the obtained malware clusters. We implemented a proof-of-concept version of our new scalable malware clustering system and performed experiments with about 65,000 distinct malware samples. Results from our evaluation confirm the effectiveness of the proposed system and show that, compared to [31], our approach can reduce processing times from several hours to a few minutes, and scales well to large datasets containing tens of thousands of distinct malware samples.