Combining hidden Markov models for improved anomaly detection

  • Authors:
  • Wael Khreich;Eric Granger;Robert Sabourin;Ali Miri

  • Affiliations:
  • Laboratoire d'imagerie, de vision et d'intelligence artificielle, École de technologie supérieure, Montreal, QC, Canada;Laboratoire d'imagerie, de vision et d'intelligence artificielle, École de technologie supérieure, Montreal, QC, Canada;Laboratoire d'imagerie, de vision et d'intelligence artificielle, École de technologie supérieure, Montreal, QC, Canada;School of Information Technology and Engineering, University of Ottawa, Ottawa, ON, Canada

  • Venue:
  • ICC'09 Proceedings of the 2009 IEEE international conference on Communications
  • Year:
  • 2009

Abstract

In host-based intrusion detection systems (HIDS), anomaly detection involves monitoring for significant deviations from normal system behavior. Hidden Markov Models (HMMs) have been shown to provide a high level of performance for detecting anomalies in sequences of system calls to the operating system kernel. Although the number of hidden states is a critical parameter for HMM performance, it is often chosen heuristically or empirically, by selecting the single value that provides the best performance on training data. However, this single best HMM does not typically provide a high level of performance over the entire detection space. This paper presents a multiple-HMMs approach, where each HMM is trained using a different number of hidden states, and where HMM responses are combined in the Receiver Operating Characteristics (ROC) space according to the Maximum Realizable ROC (MRROC) technique. This approach compares favorably with a single best HMM and with a traditional sequence matching technique called STIDE on different synthetic HIDS data sets. Results indicate that it provides a higher level of performance over a wide range of training set sizes, alphabet sizes, irregularity indices, and anomaly sizes, without significant computational or storage overhead.
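
The abstract does not spell out how the HMM responses are combined, so the following is only a minimal sketch of the general idea behind MRROC as it is commonly described: take the ROC operating points of all detectors, keep the upper convex hull (with the trivial detectors at (0, 0) and (1, 1)), and reach any point on a hull edge by randomly choosing between the two detectors at its endpoints. The function names, the example operating points, and the target false positive rate are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of ROC points, left to right.

    The trivial detectors (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment hull[-2] -> p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def mixing_for_fpr(hull: List[Point], target_fpr: float) -> Tuple[Point, Point, float]:
    """Find the hull edge covering target_fpr.

    Returns (left vertex, right vertex, w), where w is the probability of
    using the right-vertex detector for a given input.
    """
    for left, right in zip(hull, hull[1:]):
        if right[0] > left[0] and left[0] <= target_fpr <= right[0]:
            w = (target_fpr - left[0]) / (right[0] - left[0])
            return left, right, w
    raise ValueError("target_fpr must lie in [0, 1]")


# Hypothetical operating points for three HMMs with different state counts.
hmm_points = [(0.1, 0.6), (0.3, 0.7), (0.4, 0.9)]
hull = roc_convex_hull(hmm_points)
left, right, w = mixing_for_fpr(hull, 0.25)
expected_tpr = left[1] + w * (right[1] - left[1])
print(hull, w, expected_tpr)
```

In this made-up example the point (0.3, 0.7) falls below the hull and is discarded, which illustrates why a single detector chosen at one operating point need not be the best choice across the whole ROC space.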