A close look on n-grams in intrusion detection: anomaly detection vs. classification

  • Authors:
  • Christian Wressnegger;Guido Schwenk;Daniel Arp;Konrad Rieck

  • Affiliations:
  • idalab GmbH, Berlin, Germany;Berlin University of Technology, Berlin, Germany;University of Göttingen, Göttingen, Germany;University of Göttingen, Göttingen, Germany

  • Venue:
  • Proceedings of the 2013 ACM workshop on Artificial intelligence and security
  • Year:
  • 2013

Abstract

Detection methods based on n-gram models have been widely studied for the identification of attacks and malicious software. These methods usually build on one of two learning schemes: anomaly detection, where a model of normality is constructed from n-grams, or classification, where a discrimination between benign and malicious n-grams is learned. Although these methods have been successful in many security domains, previous work falls short of explaining why a particular scheme is used and, more importantly, what renders one favorable over the other for a given type of data. In this paper we take a close look at n-gram models for intrusion detection. We specifically study anomaly detection and classification using n-grams and develop criteria for deciding which of the two schemes suits a given type of data. Furthermore, we apply these criteria in the scope of web intrusion detection and empirically validate their effectiveness with different learning-based detection methods for client-side and service-side attacks.
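
To make the two learning schemes concrete, the following is a minimal sketch and not the authors' method. It assumes byte-level 3-grams, a set of n-grams seen in benign data as the model of normality, and a simple count-based weighting as the classifier. All function names and the sample payloads are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ngrams(data: bytes, n: int = 3) -> set:
    """Return the set of distinct byte n-grams in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def train_anomaly(benign, n: int = 3) -> set:
    """Anomaly detection: the model of normality is every n-gram seen in benign data."""
    model = set()
    for sample in benign:
        model |= ngrams(sample, n)
    return model


def anomaly_score(model: set, sample: bytes, n: int = 3) -> float:
    """Fraction of the sample's n-grams that never occurred in the benign training data."""
    grams = ngrams(sample, n)
    if not grams:
        return 0.0
    return len(grams - model) / len(grams)


def train_classifier(benign, malicious, n: int = 3) -> Counter:
    """Classification: weight each n-gram by (malicious samples containing it) minus (benign samples containing it)."""
    weights = Counter()
    for sample in malicious:
        for gram in ngrams(sample, n):
            weights[gram] += 1
    for sample in benign:
        for gram in ngrams(sample, n):
            weights[gram] -= 1
    return weights


def classify_score(weights: Counter, sample: bytes, n: int = 3) -> int:
    """Sum of learned n-gram weights; a positive score leans malicious (unseen n-grams count as 0)."""
    return sum(weights[gram] for gram in ngrams(sample, n))


# Hypothetical toy data, for illustration only.
benign = [b"GET /index.html", b"GET /about.html"]
malicious = [b"GET /index.php?id=1' OR '1'='1"]

normality = train_anomaly(benign)
weights = train_classifier(benign, malicious)
print(anomaly_score(normality, malicious[0]))
print(classify_score(weights, malicious[0]))
```

The anomaly detector needs only benign data and flags whatever deviates from it, whereas the classifier needs labeled examples of both classes and can only weight n-grams it has seen during training.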