Introduction to Modern Information Retrieval
Introduction to Modern Information Retrieval
Data Mining Methods for Detection of New Malicious Executables
SP '01 Proceedings of the 2001 IEEE Symposium on Security and Privacy
Learning to detect malicious executables in the wild
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
PolyUnpack: Automating the Hidden-Code Extraction of Unpack-Executing Malware
ACSAC '06 Proceedings of the 22nd Annual Computer Security Applications Conference
Using an Ensemble of One-Class SVM Classifiers to Harden Payload-based Anomaly Detection Systems
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Renovo: a hidden code extractor for packed executables
Proceedings of the 2007 ACM workshop on Recurring malcode
Behavior-based malware detection
Behavior-based malware detection
Malware detection using adaptive data compression
Proceedings of the 1st ACM workshop on Workshop on AISec
Eureka: A Framework for Enabling Static Malware Analysis
ESORICS '08 Proceedings of the 13th European Symposium on Research in Computer Security: Computer Security
International Journal of Computer Applications in Technology
Semi-Supervised Learning
AccessMiner: using system-centric models for malware protection
Proceedings of the 17th ACM conference on Computer and communications security
Idea: opcode-sequence-based malware detection
ESSoS'10 Proceedings of the Second international conference on Engineering Secure Software and Systems
Hi-index | 0.00 |
Malware is any computer software potentially harmful to both computers and networks. The amount of malware is growing every year and poses a serious global security threat. Signature-based detection is the most extended method in commercial antivirus software, however, it consistently fails to detect new malware. Supervised machine learning has been adopted to solve this issue, but the usefulness of supervised learning is far to be complete because it requires a high amount of malicious executables and benign software to be identified and labelled previously. In this paper, we propose a new method of malware detection that adopts a well-known semi-supervised learning approach to detect unknown malware. This method is based on examining the frequencies of the appearance of opcode sequences to build a semi-supervised machine-learning classifier using a set of labelled (either malware or legitimate software) and unlabelled instances. We performed an empirical validation demonstrating that the labelling efforts are lower than when supervised learning is used while the system maintains high accuracy rate.