A supervised topic transition model for detecting malicious system call sequences

Authors:
Han Xiao; Thomas Stibor
Affiliations:
Technische Universität München, Garching, Germany; Technische Universität München, Garching, Germany
Venue:
Proceedings of the 2011 workshop on Knowledge discovery, modeling and simulation
Year:
2011

Abstract

We propose a probabilistic model for behavior-based malware detection that jointly models sequential data and class labels. Given labeled sequences (harmless/malicious), our goal is to reveal behavior patterns and exploit them to predict the class labels of unknown sequences. The proposed model is a novel extension of supervised latent Dirichlet allocation with an estimation algorithm that alternates between Gibbs sampling and gradient descent. Experiments on a real-world data set show that our model learns meaningful patterns and provides competitive performance on the malware detection task. Moreover, we parallelize the training algorithm and demonstrate scalability with varying numbers of processors.
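
The alternating estimation scheme mentioned in the abstract can be illustrated with a small sketch. It is not the authors' topic transition model: it assumes plain supervised LDA over unordered system calls with a logistic label model, and every name in it (`train`, `sigmoid`, the hyperparameters `alpha`, `beta`, `lr`) is hypothetical. Each round runs one collapsed Gibbs sweep over the topic assignments, with the label likelihood included in the sampling weights, followed by a few gradient descent steps on the regression weights `eta`. Sequences are assumed to be non-empty lists of integer call IDs and labels to be 0 (harmless) or 1 (malicious).

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train(seqs, labels, n_topics, vocab_size, n_rounds=20,
          alpha=0.5, beta=0.1, lr=0.5, gd_steps=10, seed=0):
    """Illustrative sLDA-style training (assumed setup, not the paper's model)."""
    rng = random.Random(seed)
    topics = list(range(n_topics))
    # Random initial topic for every system call in every sequence.
    z = [[rng.randrange(n_topics) for _ in s] for s in seqs]
    n_dk = [[0] * n_topics for _ in seqs]       # topic counts per sequence
    n_kw = [[0] * vocab_size for _ in topics]   # call counts per topic
    n_k = [0] * n_topics                        # total calls per topic
    for d, s in enumerate(seqs):
        for i, w in enumerate(s):
            k = z[d][i]
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
    eta = [0.0] * n_topics  # logistic regression weights on topic proportions

    for _ in range(n_rounds):
        # Step 1: one collapsed Gibbs sweep over all topic assignments.
        for d, s in enumerate(seqs):
            for i, w in enumerate(s):
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Label score contributed by the other calls in the sequence.
                rest = sum(eta[j] * n_dk[d][j] for j in topics)
                weights = []
                for j in topics:
                    p = sigmoid((rest + eta[j]) / len(s))
                    lik = p if labels[d] == 1 else 1.0 - p
                    weights.append((n_dk[d][j] + alpha)
                                   * (n_kw[j][w] + beta)
                                   / (n_k[j] + vocab_size * beta)
                                   * max(lik, 1e-12))
                k = rng.choices(topics, weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

        # Step 2: gradient descent on the logistic loss, topics held fixed.
        zbar = [[c / len(s) for c in n_dk[d]] for d, s in enumerate(seqs)]
        for _ in range(gd_steps):
            grad = [0.0] * n_topics
            for d, props in enumerate(zbar):
                p = sigmoid(sum(e * q for e, q in zip(eta, props)))
                for j in topics:
                    grad[j] += (p - labels[d]) * props[j]
            eta = [e - lr * g / len(seqs) for e, g in zip(eta, grad)]

    return eta, n_dk, n_kw
```

Calling `train(seqs, labels, n_topics=2, vocab_size=4)` on a handful of toy sequences returns the learned weights together with the per-sequence and per-topic count tables. Ordering information, which a transition model would exploit, is deliberately ignored here to keep the sketch short.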