Using randomized projection techniques to aid in detecting high-dimensional malicious applications

Authors:
Jan Durand; Travis Atkison
Affiliations:
Louisiana Tech University, Ruston, LA; Louisiana Tech University, Ruston, LA
Venue:
Proceedings of the 49th Annual Southeast Regional Conference
Year:
2011

Abstract

This work is part of an ongoing effort to use randomized projection as a feature extraction and reduction method for a cosine-similarity-based information retrieval technique, with the goal of improving the detection of known malicious applications and their variants. We follow a standard information retrieval methodology in which each piece of software is treated as a document in a corpus. The corpus can then be searched with a query, in this case a known malicious program, to retrieve and identify potentially malicious software and other instances of the same type of vulnerability. In our experiments, we compare randomized projection with a Gaussian-distributed random matrix against two alternative methods: sparse-matrix randomized projection and Linial-London-Rabinovich random-set randomized projection. We assess the performance of each method when applied to features of malicious applications extracted via n-gram analysis, an information retrieval technique. In our results, the Gaussian-distributed random matrix approach outperformed the other methods, with generally higher values for each observed performance metric; however, each algorithm showed promise in selected scenarios. These results support the hypothesis that random matrix projection, applied as a dimensionality reduction method for the cosine similarity metric, has merit in determining whether an application may contain malicious code.
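
The pipeline the abstract describes (n-gram feature extraction, random projection to a lower dimension, cosine similarity against a query) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the choice of byte 3-grams, the reduced dimension `k`, and the toy byte strings are all assumptions made for illustration. The sparse variant is assumed to follow the Achlioptas-style construction (entries +s, 0, -s with probabilities 1/6, 2/3, 1/6), which the abstract does not confirm, and the Linial-London-Rabinovich random-set method is left out because the abstract gives no details of it.

```python
import math
import random
from collections import Counter


def ngram_counts(data: bytes, n: int = 3) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_vector(counts, vocabulary):
    """Lay n-gram counts out as a dense vector over a fixed vocabulary."""
    return [float(counts.get(gram, 0)) for gram in vocabulary]


def gaussian_matrix(k, d, rng):
    """k x d matrix of i.i.d. N(0, 1) entries scaled by 1/sqrt(k)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def sparse_matrix(k, d, rng):
    """k x d matrix with entries +s, 0, -s drawn with probabilities
    1/6, 2/3, 1/6, where s = sqrt(3/k) (Achlioptas-style; an assumption)."""
    s = math.sqrt(3.0 / k)
    return [[rng.choices((s, 0.0, -s), weights=(1, 4, 1))[0] for _ in range(d)]
            for _ in range(k)]


def project(matrix, vector):
    """Map a d-dimensional vector to k dimensions: y = R x."""
    return [sum(r * x for r, x in zip(row, vector)) for row in matrix]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    # Toy byte strings standing in for executables (hypothetical data).
    corpus = {
        "query": bytes(range(64)) * 4,
        "variant": bytes(range(64)) * 3 + b"\x90" * 64,
        "unrelated": bytes((7 * i + 3) % 251 for i in range(256)),
    }
    counts = {name: ngram_counts(blob) for name, blob in corpus.items()}
    vocabulary = sorted(set().union(*counts.values()))
    vectors = {name: to_vector(c, vocabulary) for name, c in counts.items()}

    k = 32  # reduced dimensionality (arbitrary choice for the demo)
    for label, make in (("gaussian", gaussian_matrix), ("sparse", sparse_matrix)):
        matrix = make(k, len(vocabulary), rng)
        reduced = {name: project(matrix, v) for name, v in vectors.items()}
        for name in ("variant", "unrelated"):
            # Print similarity to the query before and after projection.
            print(label, name,
                  round(cosine_similarity(vectors["query"], vectors[name]), 3),
                  round(cosine_similarity(reduced["query"], reduced[name]), 3))
```

Running the script prints, for each projection type, the cosine similarity of each toy sample to the query in the original n-gram space and in the reduced space. If the projection preserves angles reasonably well, the two values should be close, which is the property the retrieval approach relies on; how close depends on `k` and on the random draw.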