Fast hierarchical clustering and other applications of dynamic closest pairs
Proceedings of the ninth annual ACM-SIAM symposium on Discrete algorithms
Winnowing: local algorithms for document fingerprinting
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Fully automatic cross-associations
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
N-Gram-Based Detection of New Malicious Code
COMPSAC '04 Proceedings of the 28th Annual International Computer Software and Applications Conference - Workshops and Fast Abstracts - Volume 02
PolyUnpack: Automating the Hidden-Code Extraction of Unpack-Executing Malware
ACSAC '06 Proceedings of the 22nd Annual Computer Security Applications Conference
Learning to Detect and Classify Malicious Executables in the Wild
The Journal of Machine Learning Research
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Exploring Multiple Execution Paths for Malware Analysis
SP '07 Proceedings of the 2007 IEEE Symposium on Security and Privacy
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
Communications of the ACM - 50th anniversary issue: 1958 - 2008
Classification of packed executables for accurate computer virus detection
Pattern Recognition Letters
A Study of the Packer Problem and Its Solutions
RAID '08 Proceedings of the 11th international symposium on Recent Advances in Intrusion Detection
Ether: malware analysis via hardware virtualization extensions
Proceedings of the 15th ACM conference on Computer and communications security
ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
Feature hashing for large scale multitask learning
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Automatic Reverse Engineering of Malware Emulators
SP '09 Proceedings of the 2009 30th IEEE Symposium on Security and Privacy
Large-scale malware indexing using function-call graphs
Proceedings of the 16th ACM conference on Computer and communications security
Hash Kernels for Structured Data
The Journal of Machine Learning Research
Automated classification and analysis of internet malware
RAID'07 Proceedings of the 10th international conference on Recent advances in intrusion detection
Behavioral clustering of HTTP-based malware and signature generation using malicious network traces
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
On challenges in evaluating malware clustering
RAID'10 Proceedings of the 13th international conference on Recent advances in intrusion detection
Malware classification method via binary content comparison
Proceedings of the 2012 ACM Research in Applied Computation Symposium
Malware characterization using behavioral components
MMM-ACNS'12 Proceedings of the 6th international conference on Mathematical Methods, Models and Architectures for Computer Network Security: computer network security
CodeShield: towards personalized application whitelisting
Proceedings of the 28th Annual Computer Security Applications Conference
VAMO: towards a fully automated malware clustering validity analysis
Proceedings of the 28th Annual Computer Security Applications Conference
Lines of malicious code: insights into the malicious software industry
Proceedings of the 28th Annual Computer Security Applications Conference
Fast location of similar code fragments using semantic 'juice'
PPREW '13 Proceedings of the 2nd ACM SIGPLAN Program Protection and Reverse Engineering Workshop
Fast, scalable detection of "Piggybacked" mobile applications
Proceedings of the third ACM conference on Data and application security and privacy
Scalable fine-grained behavioral clustering of HTTP-based malware
Computer Networks: The International Journal of Computer and Telecommunications Networking
Locating executable fragments with Concordia, a scalable, semantics-based architecture
Proceedings of the Eighth Annual Cyber Security and Information Intelligence Research Workshop
MAST: triage for market-scale mobile malware analysis
Proceedings of the sixth ACM conference on Security and privacy in wireless and mobile networks
Juxtapp: a scalable system for detecting code reuse among android applications
DIMVA'12 Proceedings of the 9th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Tracking memory writes for malware classification and code reuse identification
DIMVA'12 Proceedings of the 9th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Pick your poison: pricing and inventories at unlicensed online pharmacies
Proceedings of the fourteenth ACM conference on Electronic commerce
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
VILO: a rapid learning nearest-neighbor classifier for malware triage
Journal in Computer Virology
DUET: integration of dynamic and static analyses for malware clustering with cluster ensembles
Proceedings of the 29th Annual Computer Security Applications Conference
SigMal: a static signal processing based malware triage
Proceedings of the 29th Annual Computer Security Applications Conference
Driving in the cloud: an analysis of drive-by download operations and abuse reporting
DIMVA'13 Proceedings of the 10th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Exploring discriminatory features for automated malware classification
DIMVA'13 Proceedings of the 10th international conference on Detection of Intrusions and Malware, and Vulnerability Assessment
Towards automatic software lineage inference
SEC'13 Proceedings of the 22nd USENIX conference on Security
ExecScent: mining for new C&C domains in live networks with adaptive control protocol templates
SEC'13 Proceedings of the 22nd USENIX conference on Security
Revolver: an automated approach to the detection of evasiveweb-based malware
SEC'13 Proceedings of the 22nd USENIX conference on Security
MutantX-S: scalable malware clustering based on static features
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Systematic audit of third-party android phones
Proceedings of the 4th ACM conference on Data and application security and privacy
Hi-index | 0.00 |
The sheer volume of new malware found each day is growing at an exponential pace. This growth has created a need for automatic malware triage techniques that determine what malware is similar, what malware is unique, and why. In this paper, we present BitShred, a system for large-scale malware similarity analysis and clustering, and for automatically uncovering semantic inter- and intra-family relationships within clusters. The key idea behind BitShred is using feature hashing to dramatically reduce the high-dimensional feature spaces that are common in malware analysis. Feature hashing also allows us to mine correlated features between malware families and samples using co-clustering techniques. Our evaluation shows that BitShred speeds up typical malware triage tasks by up to 2,365x and uses up to 82x less memory on a single CPU, all with comparable accuracy to previous approaches. We also develop a parallelized version of BitShred, and demonstrate scalability within the Hadoop framework.