Rogue programs: viruses, worms and Trojan horses
C4.5: programs for machine learning
Latent semantic indexing: a probabilistic analysis
Journal of Computer and System Sciences - Special issue on the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems
Characterizing the behavior of a program using multiple-length N-grams
Proceedings of the 2000 workshop on New security paradigms
Database-friendly random projections
PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Random projection in dimensionality reduction: applications to image and text data
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Modern Information Retrieval
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Software forensics: old methods for a new science
SEEP '96 Proceedings of the 1996 International Conference on Software Engineering: Education and Practice (SE:EP '96)
Learning to detect malicious executables in the wild
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
N-Gram-Based Detection of New Malicious Code
COMPSAC '04 Proceedings of the 28th Annual International Computer Software and Applications Conference - Workshops and Fast Abstracts - Volume 02
Detection and identification of network anomalies using sketch subspaces
Proceedings of the 6th ACM SIGCOMM conference on Internet measurement
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
Learning to Detect and Classify Malicious Executables in the Wild
The Journal of Machine Learning Research
ACSAC '08 Proceedings of the 2008 Annual Computer Security Applications Conference
Proceedings of the 47th Annual Southeast Regional Conference
Biologically inspired defenses against computer viruses
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 1
Proceedings of the 48th Annual Southeast Regional Conference
Using randomized projection techniques to aid in detecting high-dimensional malicious applications
Proceedings of the 49th Annual Southeast Regional Conference
Shared information and program plagiarism detection
IEEE Transactions on Information Theory
This research is part of a continuing effort to show that random projection is a viable feature extraction and reduction technique for malware classification and can yield more accurate classifiers. In this paper, we use a vector space model with n-gram analysis to produce weighted feature vectors from binary executables. We then reduce these vectors to a smaller feature set in two ways, producing two separate data sets: one with the random projection method proposed by Achlioptas, and one with feature selection based on mutual information. We apply several popular machine learning algorithms to both data sets, including the J48 decision tree, naïve Bayes, support vector machines, and an instance-based learner, to produce classifiers for the detection of malicious executables. We evaluate the performance of the resulting classifiers and find that a data set reduced by random projection can improve the performance of the support vector machine and instance-based learner classifiers.
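The pipeline described above (byte n-grams, then a random projection to fewer dimensions) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the n-gram length, the target dimension, and the use of raw counts in place of the paper's unspecified weighting scheme are all assumptions made for illustration. The projection matrix uses the sparse distribution commonly attributed to Achlioptas: entries of sqrt(3) times +1, 0, or -1 with probabilities 1/6, 2/3, and 1/6.

```python
import math
import random
from collections import Counter


def byte_ngrams(data, n=2):
    """Count overlapping byte n-grams in a binary blob (n is an assumed value)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def achlioptas_matrix(d, k, seed=0):
    """Build a d x k matrix whose entries are sqrt(3) * {+1, 0, -1},
    drawn with probabilities 1/6, 2/3, 1/6."""
    rng = random.Random(seed)
    s = math.sqrt(3.0)
    # weights (1, 4, 1) normalise to 1/6, 2/3, 1/6
    return [rng.choices((s, 0.0, -s), weights=(1, 4, 1), k=k) for _ in range(d)]


def project(x, r):
    """Map a d-dimensional vector x to k dimensions: (1 / sqrt(k)) * x R."""
    k = len(r[0])
    scale = 1.0 / math.sqrt(k)
    # Skip zero features; n-gram vectors are typically sparse.
    return [scale * sum(xi * row[j] for xi, row in zip(x, r) if xi)
            for j in range(k)]


if __name__ == "__main__":
    # Hypothetical input: a few bytes standing in for an executable.
    sample = b"\x4d\x5a\x90\x00\x4d\x5a\x90\x00"
    counts = byte_ngrams(sample, n=2)
    vocab = sorted(counts)                    # assumed fixed n-gram vocabulary
    x = [float(counts[g]) for g in vocab]     # raw counts, not the paper's weights
    r = achlioptas_matrix(d=len(vocab), k=3, seed=42)
    print(project(x, r))
```

In practice the vocabulary would be built over the whole corpus and the same matrix reused for every executable, so that all projected vectors share one reduced feature space before being passed to a classifier.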