Malware detection using data mining techniques has been explored extensively. Techniques that detect malware from structural features rely on identifying anomalies in the structure of executable files. The structural attributes that can be extracted from an executable include byte n-grams, Portable Executable (PE) features, API call sequences, and strings. After a thorough analysis, we extracted various features from executable files and applied them to an ensemble of classifiers to detect malware efficiently. Ensemble methods combine several individual pattern classifiers to achieve better classification. The challenge is to choose the minimal number of classifiers that achieves the best performance. An ensemble with too many members may incur large storage requirements and can even reduce classification performance. The goal of ensemble pruning is therefore to identify a subset of ensemble members that performs at least as well as the original ensemble and to discard the remaining members. In this paper we propose a novel approach that prunes the ensemble using harmony search, a music-inspired optimization algorithm. The pruned ensemble is then used for malware detection. The ensemble is constructed from multiple heterogeneous classifiers arranged in parallel, and harmony search selects the best subset of classifiers to form the pruned ensemble. Experimental results show that our algorithm achieves high detection accuracy and outperforms existing ensemble algorithms.
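
The pruning step described above can be illustrated with a minimal sketch of harmony search over binary selection masks. The abstract gives no details of the encoding, the fitness function, or the parameter values, so everything here is an assumption made for illustration: each harmony is a 0/1 mask over ensemble members, fitness is majority-vote accuracy on a validation set minus a small size penalty, and the names `vote_accuracy`, `fitness`, `harmony_search_prune`, `size_penalty`, and the toy data are hypothetical. `hms`, `hmcr`, and `par` are the usual harmony search parameters (memory size, memory considering rate, pitch adjusting rate) with arbitrary values.

```python
import random

def vote_accuracy(mask, preds, labels):
    """Majority-vote accuracy of the classifiers selected by a 0/1 mask."""
    chosen = [p for keep, p in zip(mask, preds) if keep]
    if not chosen:
        return 0.0  # an empty ensemble cannot classify anything
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)                 # number of "malware" (1) votes
        predicted = 1 if 2 * votes > len(chosen) else 0   # ties go to benign (0)
        correct += predicted == y
    return correct / len(labels)

def fitness(mask, preds, labels, size_penalty=0.01):
    # Assumed objective: accuracy minus a small penalty on the fraction of members kept.
    return vote_accuracy(mask, preds, labels) - size_penalty * sum(mask) / len(mask)

def harmony_search_prune(preds, labels, hms=10, hmcr=0.9, par=0.3, iters=500, seed=0):
    rng = random.Random(seed)
    n = len(preds)
    # Harmony memory: hms random selection masks and their fitness scores.
    memory = [[rng.randint(0, 1) for _ in range(n)] for _ in range(hms)]
    scores = [fitness(m, preds, labels) for m in memory]
    for _ in range(iters):
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                bit = rng.choice(memory)[j]   # memory consideration
                if rng.random() < par:
                    bit = 1 - bit             # pitch adjustment: flip the bit
            else:
                bit = rng.randint(0, 1)       # random selection
            new.append(bit)
        score = fitness(new, preds, labels)
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:             # replace the worst harmony if improved
            memory[worst], scores[worst] = new, score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

# Toy validation data (hypothetical): 1 = malware, 0 = benign.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
preds = [
    [1, 0, 1, 0, 1, 0, 1, 0],  # classifier 0: always right
    [1, 0, 1, 0, 1, 0, 1, 0],  # classifier 1: always right
    [0, 1, 0, 1, 0, 1, 0, 1],  # classifier 2: always wrong
    [1, 1, 1, 1, 1, 1, 1, 1],  # classifier 3: always predicts malware
    [0, 0, 0, 0, 0, 0, 0, 0],  # classifier 4: always predicts benign
]
mask, score = harmony_search_prune(preds, labels)
print("selected members:", [i for i, keep in enumerate(mask) if keep], "fitness:", score)
```

In practice the prediction matrix would come from the heterogeneous base classifiers evaluated on held-out executables, and the size penalty controls how strongly the search prefers smaller ensembles over marginal accuracy gains.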