A novel unsupervised classification approach for network anomaly detection by k-Means clustering and ID3 decision tree learning methods

  • Authors:
  • Yasser Yasami;Saadat Pour Mozaffari

  • Affiliations:
  • Computer Engineering Department, Amirkabir University of Technology (AUT), Tehran, Iran;Computer Engineering Department, Amirkabir University of Technology (AUT), Tehran, Iran

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a novel host-based combinatorial method based on k-Means clustering and ID3 decision tree learning algorithms for unsupervised classification of anomalous and normal activities in computer network ARP traffic. The k-Means clustering method is first applied to the normal training instances to partition it into k clusters using Euclidean distance similarity. An ID3 decision tree is constructed on each cluster. Anomaly scores from the k-Means clustering algorithm and decisions of the ID3 decision trees are extracted. A special algorithm is used to combine results of the two algorithms and obtain final anomaly score values. The threshold rule is applied for making the decision on the test instance normality. Experiments are performed on captured network ARP traffic. Some anomaly criteria has been defined and applied to the captured ARP traffic to generate normal training instances. Performance of the proposed approach is evaluated using five defined measures and empirically compared with the performance of individual k-Means clustering and ID3 decision tree classification algorithms and the other proposed approaches based on Markovian chains and stochastic learning automata. Experimental results show that the proposed approach has specificity and positive predictive value of as high as 96 and 98%, respectively.