MORPHEUS: motif oriented representations to purge hostile events from unlabeled sequences

  • Authors:
  • Gaurav Tandon;Philip Chan;Debasis Mitra

  • Affiliations:
  • Florida Institute of Technology;Florida Institute of Technology;Florida Institute of Technology

  • Venue:
  • Proceedings of the 2004 ACM workshop on Visualization and data mining for computer security
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Most of the prevalent anomaly detection systems use some training data to build models. These models are then utilized to capture any deviations resulting from possible intrusions. The efficacy of such systems is highly dependent upon a training data set free of attacks. "Clean" or labeled training data is hard to obtain. This paper addresses the very practical issue of refinement of unlabeled data to obtain a clean data set which can then train an online anomaly detection system. Our system, called MORPHEUS, represents a system call sequence using the spatial positions of motifs (subsequences) within the sequence. We also introduce a novel representation called sequence space to denote all sequences with respect to a reference sequence. Experiments on well known data sets indicate that our sequence space can be effectively used to purge anomalies from unlabeled sequences. Although an unsupervised anomaly detection system in itself, our technique is used for data purification. A "clean" training set thus obtained improves the performance of existing online host-based anomaly detection systems by increasing the number of attack detections.