An Empirical Study of Two Approaches to Sequence Learning for Anomaly Detection

Authors:
Terran Lane; Carla E. Brodley
Affiliations:
Department of Computer Science, University of New Mexico, Albuquerque, NM, USA. terran@cs.unm.edu; School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA. brodley@ecn.purdue.edu
Venue:
Machine Learning
Year:
2003

Abstract

This paper introduces the computer security domain of anomaly detection and formulates it as a machine learning task on temporal sequence data. In this domain, the goal is to develop a model or profile of the normal working state of a system user and to detect anomalous conditions as long-term deviations from the expected behavior patterns. We introduce two approaches to this problem: one employing instance-based learning (IBL) and the other using hidden Markov models (HMMs). Though not suitable for a comprehensive security solution, both approaches achieve anomaly identification performance sufficient for a low-level “focus of attention” detector in a multitier security system. Further, we evaluate model scaling techniques for the two approaches: two clustering techniques for the IBL approach and varying the number of hidden states for the HMM approach. We find that over both model classes and a wide range of model scales, there is no significant difference in performance at recognizing the profiled user. We take this invariance as evidence that, in this security domain, limited-memory models (e.g., fixed-length instances or low-order Markov models) can learn only part of the user identity information of interest, and that substantially different models will be necessary if dramatic improvements in user-based anomaly detection are to be achieved.
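
To make the instance-based formulation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a user's history is a list of command tokens, that a profile is the set of fixed-length windows seen in training, and that similarity is the fraction of matching positions; the abstract does not specify the similarity measure, window length, or smoothing used in the paper, so all three are illustrative assumptions, as are the function names.

```python
from collections import deque


def windows(tokens, length):
    """Return every contiguous fixed-length subsequence of tokens."""
    return [tuple(tokens[i:i + length]) for i in range(len(tokens) - length + 1)]


def similarity(a, b):
    """Fraction of positions where two equal-length windows agree.

    This positional-match measure is an assumption chosen for illustration.
    """
    return sum(x == y for x, y in zip(a, b)) / len(a)


def profile_scores(profile, tokens, length):
    """Score each window of tokens by its best match among the profile instances."""
    return [max(similarity(w, p) for p in profile) for w in windows(tokens, length)]


def smoothed(scores, span):
    """Trailing mean over the last `span` scores.

    Averaging means that only sustained (long-term) deviations pull the
    signal down, rather than a single unusual window.
    """
    out, recent = [], deque(maxlen=span)
    for s in scores:
        recent.append(s)
        out.append(sum(recent) / len(recent))
    return out


def flag_anomalies(scores, threshold):
    """Mark each time step whose smoothed score falls below the threshold."""
    return [s < threshold for s in scores]


# Hypothetical training history for the profiled user.
normal = "cd ls vi make cd ls vi make cd ls vi make".split()
profile = sorted(set(windows(normal, 3)))

same_user = "ls vi make cd ls vi".split()
other_user = "nc wget chmod sh nc wget".split()

same_flags = flag_anomalies(smoothed(profile_scores(profile, same_user, 3), 2), 0.5)
other_flags = flag_anomalies(smoothed(profile_scores(profile, other_user, 3), 2), 0.5)
```

In this toy data the profiled user's windows all match stored instances exactly, while the other user's windows share no positions with any of them. An HMM-based variant would replace `profile_scores` with a per-window likelihood under a model trained on the same history; the smoothing and thresholding steps would stay the same in this sketch.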