Machine Learning
Data Mining Methods for Detection of New Malicious Executables
SP '01 Proceedings of the 2001 IEEE Symposium on Security and Privacy
Detection of injected, dynamically generated, and obfuscated malicious code
Proceedings of the 2003 ACM workshop on Rapid malcode
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
A study of the behavior of several methods for balancing machine learning training data
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
Learning to detect malicious executables in the wild
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Static Analyzer of Vicious Executables (SAVE)
ACSAC '04 Proceedings of the 20th Annual Computer Security Applications Conference
IEEE Transactions on Pattern Analysis and Machine Intelligence
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Static analysis of executables to detect malicious patterns
SSYM'03 Proceedings of the 12th conference on USENIX Security Symposium - Volume 12
Classification of software behaviors for failure detection: a discriminative pattern mining approach
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
An improvement for fast-flux service networks detection based on data mining techniques
RSFDGrC'11 Proceedings of the 13th international conference on Rough sets, fuzzy sets, data mining and granular computing
Mining control flow graph as API call-grams to detect portable executable malware
Proceedings of the Fifth International Conference on Security of Information and Networks
Behavioural detection with API call-grams to identify malicious PE files
Proceedings of the First International Conference on Security of Internet of Things
Malware detection by pruning of parallel ensembles using harmony search
Pattern Recognition Letters
Hi-index | 0.00 |
Financial loss due to malware nearly doubles every two years. For instance in 2006, malware caused near 33.5 Million GBP direct financial losses only to member organizations of banks in UK. Recent malware cannot be detected by traditional signature based anti-malware tools due to their polymorphic and/or metamorphic nature. Malware detection based on its immutable characteristics has been a recent industrial practice. The datasets are not public. Thus the results are not reproducible and conducting research in academic setting is difficult. In this work, we not only have improved a recent method of malware detection based on mining Application Programming Interface (API) calls significantly, but also have created the first public dataset to promote malware research. Our technique first reads API call sets used in a collection of Portable Executable (PE) files, then generates a set of discriminative and domain interpretable features. These features are then used to train a classifier to detect unseen malware. We have achieved detection rate of 99.7% while keeping accuracy as high as 98.3%. Our method improved state of the art technology in several aspects: accuracy by 5.24%, detection rate by 2.51% and false alarm rate was decreased from 19.86% to 1.51%. This project's data and source code can be found at http://home.shirazu.ac.ir/~sami/malware.