An Incremental-Learning Method for Supervised Anomaly Detection by Cascading Service Classifier and ITI Decision Tree Methods

  • Authors:
  • Wei-Yi Yu;Hahn-Ming Lee

  • Affiliations:
  • Department of Computer Science and Information Engineering National Taiwan, University of Science and Technology, Taipei, Taiwan, R.O.C 106;Department of Computer Science and Information Engineering National Taiwan, University of Science and Technology, Taipei, Taiwan, R.O.C 106

  • Venue:
  • PAISI '09 Proceedings of the Pacific Asia Workshop on Intelligence and Security Informatics
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, the incremental learning method to cascade Service Classifier and ITI (incremental tree inducer) methods for supervised anomaly detection, called "SC+ITI", is proposed for classifying anomalous and normal instances in a computer network. Since the ITI method can not handle new instances with new service value, the SC+ITI cascading method is proposed to avoid this. Two steps are in SC+ITI cascading methods. First, the Service Classifier method partitions the training instances into n service clusters according to different service value. Second, in order to avoid handling instances with new service value, the ITI method is trained with instances with the same service value in the cluster. In 2007, Gaddam et al. showed KMeans+ID3 cascading method which mitigates two problems 1) the Forced Assignment problem and 2) the Class Dominance problem. His method with Nearest Neighbor (NN) combination rule outperforms the other three methods (i.e., K-Means, ID3 and KMeans+ID3 with Nearest Consensus rule) over the 1998 MIT-DARPA data set. Since the KDD'99 data set was also extracted from the 1998 MIT-DARPA data set, Nearest Neighbor combination rule within K-Means+ITI and SOM+ITI cascading methods is used in our experiments. We compare the performance of SC+ITI with the K-Means, SOM, ITI, K-Means+ITI and SOM+ITI methods in terms of the Detection Rate and False Positive Rate (FPR) over the KDD'99 data set. The results show that the ITI method have better performance than the K-Means, SOM, K-Means+ITI and SOM+ITI methods in terms of the overall Detection Rate. Our method, the Service Classifier and ITI cascading method outperforms the ITI method in terms of the Detection Rate and FPR and shows better Detection Rate as compared to other methods. Like the ITI method, our method also provides the additional options of handling missing values data and incremental learning.