VILO is a lazy learner system designed for malware classification and triage. It implements a nearest neighbor (NN) algorithm with similarities computed over Term Frequency $\times$ Inverse Document Frequency (TFIDF) weighted opcode mnemonic permutation features (N-perms). Being an NN classifier, VILO makes minimal structural assumptions about class boundaries and is thus well suited to the constantly changing malware population. This paper presents an extensive study of the application of VILO to malware analysis. Our experiments demonstrate that (a) VILO is a rapid learner of malware families, i.e., VILO's learning curve stabilizes at high accuracies quickly (training on fewer than 20 variants per family is sufficient); (b) similarity scores derived from TFIDF weighted features should primarily be treated as ordinal measurements; and (c) VILO with N-perm feature vectors outperforms traditional N-gram feature vectors when used to classify real-world malware into their respective families.
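
The pipeline described above (N-perm extraction, TFIDF weighting, nearest neighbor lookup) can be illustrated with a minimal sketch. This is not VILO's implementation. It assumes that an N-perm is an N-gram of opcode mnemonics with the ordering inside the window ignored, that similarity is cosine similarity, and that IDF is smoothed. The function names (`n_perms`, `fit_idf`, `weight`, `cosine`, `classify`) and the toy opcode sequences are hypothetical. In keeping with finding (b), the sketch uses the similarity scores only to rank neighbors.

```python
import math
from collections import Counter


def n_perms(opcodes, n=2):
    """Return order-insensitive n-grams: each window of n opcodes is sorted.

    Assumption: an N-perm is an N-gram whose internal ordering is ignored.
    """
    return [tuple(sorted(opcodes[i:i + n])) for i in range(len(opcodes) - n + 1)]


def fit_idf(docs):
    """Compute a smoothed IDF per feature (the smoothing is an assumption)."""
    df = Counter()
    for feats in docs:
        df.update(set(feats))  # count each feature once per document
    total = len(docs)
    return {f: math.log((1 + total) / (1 + c)) + 1.0 for f, c in df.items()}


def weight(feats, idf):
    """Build a sparse TFIDF vector; features unseen in training are dropped."""
    tf = Counter(feats)
    return {f: c * idf[f] for f, c in tf.items() if f in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(query_ops, training, n=2):
    """Label a query with the family of its most similar training sample.

    training: list of (family_label, opcode_list) pairs (toy data here).
    Returns the predicted label and the full ranking, best match first.
    """
    docs = [n_perms(ops, n) for _, ops in training]
    idf = fit_idf(docs)
    query_vec = weight(n_perms(query_ops, n), idf)
    ranked = sorted(
        ((cosine(query_vec, weight(d, idf)), fam)
         for d, (fam, _) in zip(docs, training)),
        reverse=True,
    )
    return ranked[0][1], ranked


# Hypothetical opcode sequences for two made-up families.
training = [
    ("famA", ["push", "mov", "add", "mov", "pop", "ret"]),
    ("famB", ["xor", "jmp", "call", "xor", "jmp", "call", "ret"]),
]
# A variant of famA with two instructions swapped.
query = ["mov", "push", "mov", "add", "pop", "ret"]
label, ranked = classify(query, training)
```

Because each window is sorted, the swapped `push`/`mov` pair in the query still yields the same feature as the original sample, which is the motivation for preferring N-perms over N-grams when variants reorder instructions. Being a lazy learner, the sketch does no training beyond storing the samples and their document frequencies.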