SigMal: a static signal processing based malware triage

Authors:
Dhilung Kirat; Lakshmanan Nataraj; Giovanni Vigna; B. S. Manjunath
Affiliation:
University of California, Santa Barbara (all authors)
Venue:
Proceedings of the 29th Annual Computer Security Applications Conference
Year:
2013

Abstract

In this work, we propose SigMal, a fast and precise malware detection framework based on signal processing techniques. SigMal is designed to operate with systems that process large numbers of binary samples. Many samples received by such systems are variants of previously seen malware and retain some similarity at the binary level. Previous systems used this notion of malware similarity to detect new variants of previously seen malware. SigMal improves on the state of the art by using techniques borrowed from signal processing to extract noise-resistant similarity signatures from the samples. SigMal uses an efficient nearest-neighbor search technique that scales to millions of samples. We evaluate SigMal on 1.2 million recent samples, both packed and unpacked, observed over a period of three months. We also use a constant dataset of known benign executables. Our results show that SigMal can classify 50% of the recent incoming samples with above 99% precision. We also show that SigMal could have detected, on average, 70 malware samples per day before any antivirus vendor detected them.
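
The general idea of similarity-based triage (compute a signature per sample, find the nearest known signature, and label the sample only if the match is close enough) can be illustrated with a minimal Python sketch. This is not SigMal's method: the abstract does not specify the signature or the search structure, so the sketch assumes a normalized byte histogram as a stand-in signature, a plain linear scan instead of a scalable nearest-neighbor index, and a hypothetical distance threshold of 0.2. All function names and sample bytes are invented for illustration.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def byte_histogram(data: bytes) -> List[float]:
    """Normalized 256-bin byte histogram (a stand-in signature, not SigMal's)."""
    counts = Counter(data)
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(b, 0) / total for b in range(256)]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length signatures."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(
    query: Sequence[float],
    known: Sequence[Tuple[str, Sequence[float]]],
    threshold: float,
) -> Optional[str]:
    """Return the label of the nearest known signature, or None if the
    nearest neighbor is farther away than the threshold (left unclassified)."""
    best_label, best_dist = None, float("inf")
    for label, sig in known:  # linear scan; a real system would use an index
        d = euclidean(query, sig)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= threshold else None


# Hypothetical labeled samples: one made-up "family" and one benign file.
known = [
    ("family_a", byte_histogram(b"\x90" * 64 + b"ABCD" * 16)),
    ("benign", byte_histogram(bytes(range(256)))),
]

# A slightly modified variant of the family_a sample, and an unrelated sample.
variant = b"\x90" * 60 + b"ABCD" * 16 + b"\x00" * 4
unrelated = b"\xff" * 128

variant_label = nearest_label(byte_histogram(variant), known, threshold=0.2)
unrelated_label = nearest_label(byte_histogram(unrelated), known, threshold=0.2)
```

Here the variant lies close to the `family_a` signature and receives that label, while the unrelated sample has no neighbor within the threshold and stays unclassified. Leaving distant samples unlabeled is one way a triage system can trade coverage for precision, which is consistent with the abstract's report of classifying about half of incoming samples at high precision.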