Probabilistic suffix models for API sequence analysis of Windows XP applications

Authors:
Geoffrey Mazeroff;Jens Gregor;Michael Thomason;Richard Ford
Affiliations:
Department of Computer Science, University of Tennessee Knoxville, 203 Claxton Complex, Knoxville, TN 37996-3450, USA;Department of Computer Science, University of Tennessee Knoxville, 203 Claxton Complex, Knoxville, TN 37996-3450, USA;Department of Computer Science, University of Tennessee Knoxville, 203 Claxton Complex, Knoxville, TN 37996-3450, USA;Department of Computer Sciences, Florida Institute of Technology, 150 W. University Blvd., Melbourne, FL 32901, USA
Venue:
Pattern Recognition
Year:
2008

Abstract

Given the pervasive nature of malicious mobile code (viruses, worms, etc.), developing statistical and structural models of code execution is of considerable importance. We investigate the use of probabilistic suffix trees (PSTs) and their associated probabilistic suffix automata (PSAs) to build models of benign application behavior, with the goal of detecting malicious applications as those whose behavior deviates from these models. We describe these probabilistic suffix models and present new generic algorithms for analyzing and manipulating them. The models and algorithms are then applied to API (i.e., system call) sequences produced by Windows XP applications. When applied to traces (i.e., sequences of API calls) of benign and malicious applications, the analysis algorithms aid in choosing an appropriate modeling strategy in terms of distance metrics, and consequently provide classification measures based on sequence-to-model matching. We give experimental results for the classification of previously unseen traces of benign and malicious applications against a suffix model trained solely on traces generated by a small set of benign applications.
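
The general idea of sequence-to-model matching can be illustrated with a minimal sketch. It is not the authors' PST/PSA construction or their distance metrics: it is a simplified count-based variable-order Markov model that predicts each call from the longest context suffix seen in training. The class name `SuffixModel`, the parameters `max_depth` and `alpha`, the additive smoothing, and the toy traces are all assumptions made for illustration. The API names are real Win32 calls, but the traces themselves are invented.

```python
import math
from collections import defaultdict


class SuffixModel:
    """Toy variable-order Markov model keyed by context suffixes.

    Illustrative only: a real PST learner also prunes contexts using
    probability thresholds, which this sketch does not do.
    """

    def __init__(self, max_depth=2, alpha=0.5):
        self.max_depth = max_depth  # longest context (suffix) length kept
        self.alpha = alpha          # additive smoothing constant (assumed)
        self.counts = defaultdict(lambda: defaultdict(int))
        self.alphabet = set()

    def train(self, traces):
        """Count next-symbol occurrences for every context up to max_depth."""
        for trace in traces:
            self.alphabet.update(trace)
            for i, symbol in enumerate(trace):
                for d in range(min(i, self.max_depth) + 1):
                    context = tuple(trace[i - d:i])
                    self.counts[context][symbol] += 1

    def prob(self, history, symbol):
        """Smoothed P(symbol | longest suffix of history seen in training)."""
        context = ()
        for d in range(min(len(history), self.max_depth), -1, -1):
            context = tuple(history[len(history) - d:])
            if context in self.counts:
                break
        nexts = self.counts[context]
        total = sum(nexts.values())
        vocab = len(self.alphabet) + 1  # one extra slot for unseen symbols
        return (nexts.get(symbol, 0) + self.alpha) / (total + self.alpha * vocab)

    def score(self, trace):
        """Average log-likelihood per call; higher means a closer match."""
        if not trace:
            return 0.0
        total = 0.0
        for i, symbol in enumerate(trace):
            total += math.log(self.prob(trace[:i], symbol))
        return total / len(trace)


# Invented "benign" training traces (toy data, not from the paper).
benign = [
    ["CreateFile", "ReadFile", "CloseHandle"] * 3,
    ["CreateFile", "WriteFile", "CloseHandle"] * 3,
]
model = SuffixModel(max_depth=2)
model.train(benign)

# One trace that follows the trained patterns and one that does not.
similar = ["CreateFile", "ReadFile", "CloseHandle",
           "CreateFile", "WriteFile", "CloseHandle"]
odd = ["WriteFile", "CreateFile", "CreateFile",
       "CloseHandle", "ReadFile", "ReadFile"]

similar_score = model.score(similar)
odd_score = model.score(odd)
print(similar_score, odd_score)
```

In this sketch a trace would be flagged as suspicious when its score falls below a threshold chosen on held-out benign traces. How such a threshold is set, and which matching measure works best, is the kind of modeling question the analysis algorithms in the paper are meant to inform.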