Malware detection by pruning of parallel ensembles using harmony search

  • Authors:
  • Shina Sheen;R. Anitha;P. Sirisha

  • Affiliations:
  • -;-;-

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2013

Abstract

Malware detection using data mining techniques has been explored extensively. Techniques that detect malware from structural features rely on identifying anomalies in the structure of executable files. The structural attributes that can be extracted from an executable include byte n-grams, Portable Executable (PE) features, API call sequences, and strings. After a thorough analysis, we extract various features from executable files and apply them to an ensemble of classifiers to detect malware efficiently. Ensemble methods combine several individual pattern classifiers to achieve better classification. The challenge is to choose the minimal number of classifiers that achieves the best performance. An ensemble that contains too many members can incur large storage requirements and may even reduce classification performance. Hence, the goal of ensemble pruning is to identify a subset of ensemble members that performs at least as well as the original ensemble and to discard the remaining members. In this paper we propose a novel approach to ensemble pruning using harmony search, a music-inspired optimization algorithm. The ensemble is constructed from multiple heterogeneous classifiers arranged in parallel, and harmony search selects the best subset of classifiers to form the pruned ensemble, which is then used for malware detection. The experimental results show that our algorithm achieves high detection accuracy and outperforms the existing ensemble algorithms.
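
To make the pruning idea concrete, here is a minimal sketch of how harmony search could select a subset of classifiers. It is not the authors' implementation. Everything below is an assumption made for illustration: the binary encoding (one bit per classifier), majority voting as the combination rule, the small size penalty in the fitness function, the parameter values (hms, hmcr, par, iters), and the toy prediction data.

```python
import random

def vote_accuracy(mask, preds, labels):
    """Majority-vote accuracy of the classifiers selected by mask (1 = keep)."""
    chosen = [p for keep, p in zip(mask, preds) if keep]
    if not chosen:
        return 0.0  # an empty ensemble cannot classify anything
    correct = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in chosen)
        pred = 1 if 2 * votes > len(chosen) else 0  # ties fall back to class 0
        correct += (pred == y)
    return correct / len(labels)

def fitness(mask, preds, labels, size_penalty=0.001):
    """Accuracy minus a small (assumed) penalty per retained classifier."""
    return vote_accuracy(mask, preds, labels) - size_penalty * sum(mask)

def harmony_search_prune(preds, labels, hms=10, hmcr=0.9, par=0.3,
                         iters=500, seed=0):
    """Return (best_mask, best_fitness) found by a binary harmony search."""
    rng = random.Random(seed)
    n = len(preds)
    # Harmony memory: hms random subsets of the ensemble.
    memory = [[rng.randint(0, 1) for _ in range(n)] for _ in range(hms)]
    scores = [fitness(m, preds, labels) for m in memory]
    for _ in range(iters):
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                # Memory consideration: reuse bit j of a stored harmony.
                bit = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    bit = 1 - bit  # pitch adjustment: flip the bit
            else:
                bit = rng.randint(0, 1)  # random selection
            new.append(bit)
        s = fitness(new, preds, labels)
        worst = min(range(hms), key=scores.__getitem__)
        if s > scores[worst]:
            # Replace the worst stored harmony with the improvised one.
            memory[worst], scores[worst] = new, s
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

# Toy validation data: 4 hypothetical classifiers, 6 samples (1 = malware).
labels = [1, 1, 0, 0, 1, 0]
preds = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
]
mask, score = harmony_search_prune(preds, labels)
print("selected classifiers:", [i for i, keep in enumerate(mask) if keep])
```

In practice the rows of preds would hold each base classifier's predictions on a held-out validation set, and the selected subset would then be evaluated on separate test data.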