Applying random projection to the classification of malicious applications using data mining algorithms

  • Authors: Jan Durand; Travis Atkison
  • Affiliations: Louisiana Tech University, Ruston, LA; Louisiana Tech University, Ruston, LA
  • Venue: Proceedings of the 50th Annual Southeast Regional Conference
  • Year: 2012

Abstract

This research is part of a continuing effort to show the viability of random projection as a feature extraction and reduction technique for malware classification, with the aim of producing more accurate classifiers. In this paper, we use a vector space model with n-gram analysis to produce weighted feature vectors from binary executables. We then reduce these vectors to a smaller feature set in two ways, using the random projection method proposed by Achlioptas and using mutual information feature selection, to produce two separate data sets. We apply several popular machine learning algorithms, including the J48 decision tree, naïve Bayes, support vector machines, and an instance-based learner, to both data sets to produce classifiers for the detection of malicious executables. We evaluate the performance of the resulting classifiers and find that a data set reduced by random projection can improve the performance of the support vector machine and instance-based learner classifiers.
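
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`ngram_counts`, `achlioptas_matrix`, `project`), the toy byte strings, the n-gram length of 3, and the target dimension of 4 are all assumptions made for illustration, and raw n-gram counts stand in for the weighting scheme, which the abstract does not specify. The projection matrix follows the sparse construction commonly attributed to Achlioptas, with entries drawn from sqrt(3) * {+1, 0, -1} with probabilities 1/6, 2/3, and 1/6.

```python
import math
import random
from collections import Counter


def ngram_counts(data, n=3):
    """Count overlapping byte n-grams in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def achlioptas_matrix(d, k, seed=0):
    """Build a d x k sparse random matrix.

    Each entry is sqrt(3) * {+1, 0, -1} with probabilities 1/6, 2/3, 1/6.
    """
    rng = random.Random(seed)
    s = math.sqrt(3.0)
    return [
        [rng.choices((s, 0.0, -s), weights=(1, 4, 1))[0] for _ in range(k)]
        for _ in range(d)
    ]


def project(vec, matrix):
    """Map a d-dimensional vector to k dimensions: (1 / sqrt(k)) * vec @ matrix."""
    k = len(matrix[0])
    out = [0.0] * k
    for x, row in zip(vec, matrix):
        if x:  # skip zero features; n-gram vectors are mostly sparse
            for j, r in enumerate(row):
                if r:
                    out[j] += x * r
    scale = 1.0 / math.sqrt(k)
    return [scale * v for v in out]


# Hypothetical toy "executables"; real inputs would be full binary files.
samples = [
    b"\x4d\x5a\x90\x00\x03\x00\x00\x00",
    b"\x4d\x5a\x90\x00\xff\xff\x00\x00",
]

# Vector space model: one dimension per distinct n-gram seen in the corpus.
counts = [ngram_counts(s, n=3) for s in samples]
vocab = sorted(set().union(*counts))
vectors = [[float(c[g]) for g in vocab] for c in counts]

# Reduce every feature vector to k = 4 dimensions (assumed value).
R = achlioptas_matrix(len(vocab), k=4, seed=42)
reduced = [project(v, R) for v in vectors]
```

The reduced vectors in `reduced` would then be passed to a classifier such as a support vector machine or an instance-based learner. Because two thirds of the matrix entries are zero on average, the projection needs fewer multiplications than a dense Gaussian projection, which is one reason this construction is attractive for high-dimensional n-gram features.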