Processing of massive audit data streams for real-time anomaly intrusion detection

  • Authors:
  • Wei Wang;Xiaohong Guan;Xiangliang Zhang

  • Affiliations:
  • State Key Laboratory for Manufacturing Systems (SKLMS) and MOE Key Lab for Intelligent Networks and Network Security (KLINNS), Xi'an Jiaotong University, Xi'an 710049, China;State Key Laboratory for Manufacturing Systems (SKLMS) and MOE Key Lab for Intelligent Networks and Network Security (KLINNS), Xi'an Jiaotong University, Xi'an 710049, China and Center for Intelli ...;State Key Laboratory for Manufacturing Systems (SKLMS) and MOE Key Lab for Intelligent Networks and Network Security (KLINNS), Xi'an Jiaotong University, Xi'an 710049, China

  • Venue:
  • Computer Communications
  • Year:
  • 2008

Quantified Score

Hi-index 0.25

Abstract

Intrusion detection is an important technique in the defense-in-depth network security framework. Most current intrusion detection models cannot process massive audit data streams for real-time anomaly detection. In this paper, we present an effective anomaly intrusion detection model based on Principal Component Analysis (PCA). Because the model considers the frequency property of audit events rather than their transition or correlation properties, it is better suited to high-speed, real-time processing of massive data streams from various data sources. It can serve as a general framework within which a practical Intrusion Detection System (IDS) can be implemented in various computing environments. In this method, a multi-pronged anomaly detection model is used to monitor various computer system and network behaviors. Three sources of data are used to validate the model and the method: system call data from the University of New Mexico (lpr) and from the KLINNS Lab of Xi'an Jiaotong University (ftp), shell command data from the AT&T Research laboratory, and network data from MIT Lincoln Lab. The frequencies of individual system calls generated by one process, the frequencies of individual commands embedded in one command block, and the features extracted from one network connection are each transformed into an input data vector. Our method reduces the dimensionality of these high-dimensional data vectors, so detection is handled in a lower-dimensional space with high efficiency and low use of system resources. The distance between a vector and its reconstruction in the reduced subspace is used for anomaly detection. Empirical results show that our model is promising in terms of detection accuracy and computational efficiency, and is thus amenable to real-time intrusion detection.
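
The detection idea summarized in the abstract (frequency vectors, PCA reduction, distance between a vector and its reconstruction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the system call vocabulary, the toy traces, the number of retained components, and the threshold rule are all hypothetical values chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def frequency_vector(events, vocabulary):
    """Relative frequency of each vocabulary item in one event sequence."""
    index = {name: i for i, name in enumerate(vocabulary)}
    counts = np.zeros(len(vocabulary))
    for event in events:
        if event in index:  # events outside the vocabulary are ignored
            counts[index[event]] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def fit_pca(train, n_components):
    """Return the training mean and the top principal directions (as columns)."""
    mean = train.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    return mean, vt[:n_components].T

def reconstruction_distance(x, mean, components):
    """Euclidean distance between x and its reconstruction from the subspace."""
    centered = x - mean
    reconstructed = components @ (components.T @ centered)
    return float(np.linalg.norm(centered - reconstructed))

# Hypothetical vocabulary and toy "normal" traces (one trace per process).
vocabulary = ["open", "read", "write", "close", "execve"]
normal_traces = [
    ["open", "read", "read", "close"],
    ["open", "read", "write", "close"],
    ["open", "write", "write", "close"],
    ["open", "read", "read", "read", "write", "close"],
]
train = np.array([frequency_vector(t, vocabulary) for t in normal_traces])

# Assumption: keep 2 components; in practice this would be tuned on real data.
mean, components = fit_pca(train, n_components=2)

# Assumption: threshold is the largest training distance plus a fixed margin.
train_distances = [reconstruction_distance(v, mean, components) for v in train]
threshold = max(train_distances) + 0.1

heldout = frequency_vector(["open", "read", "write", "write", "close"], vocabulary)
suspicious = frequency_vector(["execve", "execve", "open", "write", "execve"], vocabulary)

heldout_distance = reconstruction_distance(heldout, mean, components)
anomaly_distance = reconstruction_distance(suspicious, mean, components)

print("held-out normal trace flagged:", heldout_distance > threshold)
print("suspicious trace flagged:", anomaly_distance > threshold)
```

In this toy setup the suspicious trace is dominated by a call that never appears in the training data, so most of its vector lies outside the learned subspace and its reconstruction distance is large. Only the mean and a small matrix of components need to be kept at detection time, which is consistent with the low resource use the abstract emphasizes.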