Using randomized projection techniques to aid in detecting high-dimensional malicious applications

  • Authors: Jan Durand; Travis Atkison

  • Affiliations: Louisiana Tech University, Ruston, LA; Louisiana Tech University, Ruston, LA

  • Venue: Proceedings of the 49th Annual Southeast Regional Conference
  • Year: 2011

Abstract

This work is part of an ongoing effort to use randomized projection as a feature extraction and reduction method that improves a cosine-similarity information retrieval technique for detecting known malicious applications and their variants. We follow a standard information retrieval methodology in which software is treated as documents in a corpus. This makes it possible to search the corpus with a query, here a piece of malicious software, and retrieve potentially malicious software and other instances of the same type of vulnerability. In our experiments, we compare Gaussian-distributed random matrix randomized projection with two alternative methods of randomized projection, sparse matrix randomized projection and Linial-London-Rabinovich random set randomized projection, and assess their performance when applied to features of malicious applications extracted via n-gram analysis, an information retrieval technique. In our results, the Gaussian-distributed random matrix approach outperformed the other methods, with generally higher values for each observed performance metric; however, each algorithm showed promise in selected scenarios. These results support the hypothesis that random matrix projection, applied as a dimensionality reduction method for the cosine similarity metric, has merit in determining whether an application may contain malicious software.
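The pipeline the abstract describes (n-gram features, random projection, cosine similarity) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names, the byte-level 2-gram choice, the target dimension `k`, and the per-feature seeding trick are assumptions made for illustration. The sparse variant is assumed to use the common scheme with entries of +sqrt(3), 0, and -sqrt(3) drawn with probabilities 1/6, 2/3, and 1/6, which may differ from the variant evaluated in the paper. The Linial-London-Rabinovich random set method is not sketched here.

```python
import math
import random
from collections import Counter


def ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams (hypothetical feature extractor)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def _rng_for(feature: bytes, seed: int) -> random.Random:
    # Derive a per-feature generator so the random matrix never has to be
    # stored: each n-gram deterministically regenerates its own row.
    return random.Random(seed * 1_000_003 + int.from_bytes(feature, "big"))


def gaussian_row(feature: bytes, k: int, seed: int) -> list:
    """One row of an implicit Gaussian random matrix, entries ~ N(0, 1)."""
    rng = _rng_for(feature, seed)
    return [rng.gauss(0.0, 1.0) for _ in range(k)]


def sparse_row(feature: bytes, k: int, seed: int) -> list:
    """One row of an implicit sparse random matrix (assumed scheme)."""
    rng = _rng_for(feature, seed)
    scale = math.sqrt(3.0)
    row = []
    for _ in range(k):
        u = rng.random()
        # +scale with prob 1/6, -scale with prob 1/6, 0 with prob 2/3.
        row.append(scale if u < 1 / 6 else -scale if u < 1 / 3 else 0.0)
    return row


def project(counts: Counter, k: int = 64, seed: int = 0, row_fn=gaussian_row) -> list:
    """Project a sparse n-gram count vector down to k dimensions."""
    out = [0.0] * k
    for feature, count in counts.items():
        row = row_fn(feature, k, seed)
        for j in range(k):
            out[j] += count * row[j]
    scale = 1.0 / math.sqrt(k)
    return [x * scale for x in out]


def cosine(u: list, v: list) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Usage: compare a "known" sample against a slightly modified variant.
# Both byte strings are synthetic placeholders, not real malware.
known = bytes(range(256)) * 4
variant = known[:500] + b"\x90" * 8 + known[500:]
score = cosine(project(ngram_counts(known)), project(ngram_counts(variant)))
print(f"similarity in projected space: {score:.3f}")
```

The same seed must be used for the query and for every document in the corpus, since similarities are only meaningful when all vectors pass through the same random matrix. Passing `row_fn=sparse_row` to `project` swaps in the sparse variant without changing the rest of the pipeline.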