An evaluation of automatic parameter tuning of a statistics-based anomaly detection algorithm

  • Authors:
  • Yosuke Himura;Kensuke Fukuda;Kenjiro Cho;Hiroshi Esaki

  • Affiliations:
  • Esaki laboratory, Graduate School of Information Science and Technology, University of Tokyo, Tokyo, Japan;National Institute of Informatics, PRESTO, JST, Tokyo, Japan;Internet Initiative Japan, Tokyo, Japan;Graduate School of Information Science and Technology, University of Tokyo, Tokyo, Japan

  • Venue:
  • International Journal of Network Management
  • Year:
  • 2010

Abstract

We investigate automatic and dynamic parameter tuning of a statistical method for detecting anomalies in network traffic (this tuning is referred to as parameter learning), with the aim of real-time detection. The main idea behind the dynamic tuning is to predict an appropriate parameter for upcoming traffic from the detection results of the past τ traces of traffic. We refer to τ as the learning period and discuss, in particular, its appropriate value. This automatic tuning scheme is applied to the parameter setting of an anomaly detection method based on Sketch and the multi-scale gamma model, which is an unsupervised method and does not need predefined data. We analyze the tuning scheme with real traffic traces measured on a trans-Pacific link over 9 years (15 min from 14:00 Japan Standard Time every day, and 24 consecutive hours for some dates on the same link). The detection results obtained with parameter prediction are compared to those obtained with ideal parameters, that is, the parameters that maximize the detection performance for upcoming traffic. We also analyze the predictability of the ideal parameter from its past changes. The main findings of this work are as follows: (1) the ideal parameter fluctuates day by day; (2) parameter learning with a longer τ is affected by significant events included in the period, and the appropriate τ is about three traces (days) for everyday 15 min traces and around 1.5 h for 24 h traces; (3) the degradation in detection performance caused by introducing parameter learning is 17% with τ = 3 for everyday 15 min traces; (4) the changes in the ideal parameter have no periodic correlation and can be modeled as a random process that follows a normal distribution. We show that one cannot consistently use a fixed parameter in statistics-based algorithms to detect anomalies in practice. Copyright © 2010 John Wiley & Sons, Ltd.
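The idea of parameter learning over a learning period τ can be illustrated with a minimal sketch. The abstract does not state the selection rule, so the code below assumes one plausible rule: choose the candidate parameter with the highest mean detection score over the past τ traces. The function names, the candidate grid, and the score table are all hypothetical and serve only to illustrate how a learned parameter might be compared with the ideal one.

```python
from typing import Sequence


def predict_parameter(past_scores: Sequence[Sequence[float]],
                      candidates: Sequence[float],
                      tau: int) -> float:
    """Return the candidate with the highest mean score over the last tau traces.

    past_scores[t][i] is the (hypothetical) detection score of candidates[i]
    on past trace t. The selection rule is an assumption, not the paper's.
    """
    if tau < 1 or not past_scores:
        raise ValueError("need tau >= 1 and at least one past trace")
    window = past_scores[-tau:]  # the learning period: the last tau traces
    means = [sum(trace[i] for trace in window) / len(window)
             for i in range(len(candidates))]
    best = max(range(len(candidates)), key=means.__getitem__)
    return candidates[best]


def ideal_parameter(scores: Sequence[float],
                    candidates: Sequence[float]) -> float:
    """Return the candidate that maximizes the score on one trace (hindsight)."""
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best]


def degradation(scores_by_trace: Sequence[Sequence[float]],
                candidates: Sequence[float],
                tau: int) -> float:
    """Relative loss in total score when using learned instead of ideal parameters."""
    learned_total = 0.0
    ideal_total = 0.0
    for t in range(tau, len(scores_by_trace)):
        chosen = predict_parameter(scores_by_trace[:t], candidates, tau)
        learned_total += scores_by_trace[t][candidates.index(chosen)]
        ideal_total += max(scores_by_trace[t])
    if ideal_total == 0:
        return 0.0
    return 1.0 - learned_total / ideal_total


# Hypothetical scores: rows are traces (days), columns are candidate parameters.
candidates = [0.1, 0.2, 0.3]
scores = [
    [1.0, 2.0, 3.0],
    [1.0, 3.0, 2.0],
    [3.0, 2.0, 1.0],
    [2.0, 3.0, 1.0],
]
print(predict_parameter(scores[:3], candidates, tau=3))
print(degradation(scores, candidates, tau=1))
```

In this toy setting, a larger τ averages over more past traces, so a single unusual trace inside the window keeps influencing the chosen parameter for longer, which is in the spirit of finding (2) above.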