Malware detection based on mining API calls

Authors:
Ashkan Sami;Babak Yadegari;Hossein Rahimi;Naser Peiravian;Sattar Hashemi;Ali Hamze
Affiliations:
Shiraz University, Shiraz, Iran (all authors)
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010

Abstract

Financial loss due to malware nearly doubles every two years. For instance, in 2006 malware caused nearly 33.5 million GBP in direct financial losses to member organizations of banks in the UK alone. Recent malware cannot be detected by traditional signature-based anti-malware tools because of its polymorphic and/or metamorphic nature. Detecting malware from its immutable characteristics has become a recent industrial practice. However, the datasets involved are not public, so the results are not reproducible and conducting research in an academic setting is difficult. In this work, we not only significantly improve a recent method of malware detection based on mining Application Programming Interface (API) calls, but also create the first public dataset to promote malware research. Our technique first reads the API call sets used in a collection of Portable Executable (PE) files, then generates a set of discriminative and domain-interpretable features. These features are then used to train a classifier to detect unseen malware. We achieve a detection rate of 99.7% while keeping accuracy as high as 98.3%. Our method improves on the state of the art in several respects: accuracy increases by 5.24%, detection rate increases by 2.51%, and the false alarm rate decreases from 19.86% to 1.51%. This project's data and source code can be found at http://home.shirazu.ac.ir/~sami/malware.
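To make the pipeline in the abstract concrete, here is a minimal sketch of its three stages: API call sets per file, selection of discriminative features, and classification of unseen files, followed by the three reported metrics. The abstract does not state how features are generated or which classifier is used, so this sketch substitutes a simple class-frequency difference score and a summed-score decision rule. The API sets, labels, and function names are hypothetical and are not taken from the paper's dataset or code.

```python
from collections import Counter

# Hypothetical toy data: (set of API names imported by one PE file, label).
# Label 1 = malware, 0 = benign. Illustrative only, not the paper's dataset.
TRAIN = [
    ({"CreateRemoteThread", "WriteProcessMemory", "OpenProcess"}, 1),
    ({"WriteProcessMemory", "VirtualAllocEx", "OpenProcess"}, 1),
    ({"CreateRemoteThread", "VirtualAllocEx", "RegSetValueExA"}, 1),
    ({"CreateFileA", "ReadFile", "CloseHandle"}, 0),
    ({"CreateFileA", "MessageBoxA", "CloseHandle"}, 0),
    ({"ReadFile", "MessageBoxA", "OpenProcess"}, 0),
]
TEST = [
    ({"WriteProcessMemory", "CreateFileA", "VirtualAllocEx"}, 1),
    ({"CreateFileA", "CloseHandle", "CreateRemoteThread"}, 0),
    ({"RegSetValueExA", "ReadFile"}, 1),
    ({"MessageBoxA", "ReadFile"}, 0),
]


def score_apis(samples):
    """Score each API as P(api | malware) - P(api | benign).

    Assumption: a stand-in for the paper's feature generation step.
    """
    mal = [apis for apis, label in samples if label == 1]
    ben = [apis for apis, label in samples if label == 0]
    mal_counts = Counter(api for apis in mal for api in apis)
    ben_counts = Counter(api for apis in ben for api in apis)
    vocab = set(mal_counts) | set(ben_counts)
    return {
        api: mal_counts[api] / len(mal) - ben_counts[api] / len(ben)
        for api in vocab
    }


def select_features(scores, k):
    """Keep the k APIs with the largest absolute score (ties broken by name)."""
    ranked = sorted(scores, key=lambda api: (-abs(scores[api]), api))
    return ranked[:k]


def predict(apis, scores, features):
    """Flag a file as malware if its selected APIs sum to a positive score."""
    total = sum(scores[f] for f in features if f in apis)
    return 1 if total > 0 else 0


def evaluate(samples, scores, features):
    """Compute detection rate, false alarm rate, and accuracy."""
    tp = fp = tn = fn = 0
    for apis, label in samples:
        pred = predict(apis, scores, features)
        if label == 1 and pred == 1:
            tp += 1
        elif label == 1 and pred == 0:
            fn += 1
        elif label == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return {
        "detection_rate": tp / (tp + fn),      # malware correctly flagged
        "false_alarm_rate": fp / (fp + tn),    # benign files wrongly flagged
        "accuracy": (tp + tn) / len(samples),  # all correct decisions
    }


scores = score_apis(TRAIN)
features = select_features(scores, k=7)
metrics = evaluate(TEST, scores, features)
print(metrics)
```

On this toy data the third test file is missed because its only malware-leaning API, `RegSetValueExA`, falls outside the seven selected features. This shows how feature selection trades interpretability and compactness against detection rate. In a real setting the API sets would come from parsing each PE file's import table rather than from hand-written literals.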