We propose a probabilistic model for behavior-based malware detection that jointly models sequential data and class labels. Given sequences labeled as harmless or malicious, our goal is to reveal behavior patterns and exploit them to predict the class labels of unknown sequences. The proposed model is a novel extension of supervised latent Dirichlet allocation, with an estimation algorithm that alternates between Gibbs sampling and gradient descent. Experiments on a real-world data set show that our model learns meaningful patterns and provides competitive performance on the malware detection task. Moreover, we parallelize the training algorithm and demonstrate its scalability with varying numbers of processors.
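The alternating estimation scheme mentioned in the abstract can be illustrated with a rough sketch. This is not the authors' model. It assumes a generic supervised-LDA-style setup in which each sequence is a list of integer token IDs (for example, system-call IDs), the Gibbs step uses the plain LDA conditional (a full supervised sampler would also weight each topic by the label likelihood), and the label is predicted by a logistic-regression head on the empirical topic proportions. All function names, parameters, and default values here are hypothetical.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def proportions(counts):
    # Empirical topic proportions of one sequence (guarded for empty input).
    total = max(1, sum(counts))
    return [c / total for c in counts]


def predict_proba(counts, weights, bias):
    # Probability of the positive (malicious) class given topic counts.
    x = proportions(counts)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_slda_sketch(docs, labels, vocab_size, num_topics=2, sweeps=30,
                      gd_steps=20, lr=0.5, alpha=0.1, beta=0.01, seed=0):
    """Alternate collapsed Gibbs sweeps with gradient steps (assumed setup)."""
    rng = random.Random(seed)
    # Random initial topic assignment for every token, plus count tables.
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
    weights, bias = [0.0] * num_topics, 0.0

    for _ in range(sweeps):
        # Gibbs step: resample each token's topic with its own count removed.
        # Simplification: plain LDA conditional, no label-likelihood term.
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                probs = [(doc_topic[d][t] + alpha)
                         * (topic_word[t][w] + beta)
                         / (topic_total[t] + vocab_size * beta)
                         for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=probs)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

        # Gradient step: fit the logistic head with topic assignments fixed.
        feats = [proportions(c) for c in doc_topic]
        for _ in range(gd_steps):
            grad_w, grad_b = [0.0] * num_topics, 0.0
            for x, y in zip(feats, labels):
                s = sum(wi * xi for wi, xi in zip(weights, x)) + bias
                err = sigmoid(s) - y  # gradient of log-loss w.r.t. the logit
                grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
                grad_b += err
            n = max(1, len(docs))
            weights = [wi - lr * g / n for wi, g in zip(weights, grad_w)]
            bias -= lr * grad_b / n
    return weights, bias, doc_topic
```

Calling `train_slda_sketch(docs, labels, vocab_size)` with labels encoded as 0 (harmless) and 1 (malicious) returns the regression weights, the bias, and per-sequence topic counts; `predict_proba` then scores a sequence from its topic counts. Scoring an entirely new sequence would additionally require inferring its topic assignments, which this sketch omits.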