Mining Outliers with Adaptive Cutoff Update and Space Utilization (RACAS)

  • Authors:
  • Chi-Cheong Szeto; Edward Hung

  • Affiliations:
  • Department of Computing, The Hong Kong Polytechnic University, Hong Kong, email: csccszeto@comp.polyu.edu.hk; Department of Computing, The Hong Kong Polytechnic University, Hong Kong, email: csehung@comp.polyu.edu.hk

  • Venue:
  • Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
  • Year:
  • 2010

Abstract

The efficiency of the outlier detection algorithm ORCA was recently improved by RCS (Randomization with faster Cutoff update and Space utilization after pruning), which changes, at pre-specified times, how frequently the cutoff value is updated and how frequently memory space is reclaimed. How and when to change these frequencies was determined only empirically, yet the optimal setting may vary across data sets and across computers with different CPU and disk I/O performance. In this paper, we theoretically formulate two methods that further reduce the execution time of RCS by dynamically adapting the frequencies at each step to the data set and to the computer's CPU and disk I/O performance. We conducted experiments under different conditions on a real KDD Cup data set from a network intrusion detection problem. The results show that our time saving relative to optimized ORCA is substantial, up to five times that of RCS, and that it increases with the relative disk I/O cost, the percentage of outliers to find, and the data set size.
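For readers unfamiliar with the baseline, the following is a minimal sketch of an ORCA-style randomized nested loop with cutoff pruning. It is an illustration under stated assumptions, not the authors' RCS or adaptive method: the outlier score is assumed to be the distance to the k-th nearest neighbor, all data is held in memory, and the names `top_outliers` and `block_size` are hypothetical. Here `block_size` controls how often the cutoff is refreshed (once per block), which is the kind of frequency the abstract refers to.

```python
import heapq
import math
import random


def top_outliers(points, k=2, n=1, block_size=4, seed=0):
    """Return up to n (score, point) pairs, highest score first.

    Assumption: score = distance to the k-th nearest neighbor.
    Requires len(points) > k so every point has k neighbors.
    """
    data = list(points)
    random.Random(seed).shuffle(data)  # random order helps pruning fire early
    cutoff = 0.0  # score of the weakest outlier currently in the top n
    top = []      # min-heap of (score, point), at most n entries

    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Per candidate: max-heap (negated values) of its k smallest distances.
        neighbors = {i: [] for i in range(len(block))}
        alive = set(neighbors)
        for j, other in enumerate(data):
            if not alive:
                break  # every candidate in this block was pruned
            for i in list(alive):
                if start + i == j:
                    continue  # do not compare a point with itself
                d = math.dist(block[i], other)
                heap = neighbors[i]
                if len(heap) < k:
                    heapq.heappush(heap, -d)
                elif d < -heap[0]:
                    heapq.heapreplace(heap, -d)
                # The current k-th distance is an upper bound on the final
                # score; once it drops below the cutoff, prune the candidate.
                if len(heap) == k and -heap[0] < cutoff:
                    alive.discard(i)
        # Survivors have exact scores; merge them into the top-n list.
        for i in alive:
            heapq.heappush(top, (-neighbors[i][0], block[i]))
            if len(top) > n:
                heapq.heappop(top)
        # Cutoff is refreshed once per block in this sketch.
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
    print(top_outliers(pts, k=2, n=1))
```

In this sketch a smaller `block_size` refreshes the cutoff more often, which tends to prune more candidates but requires more passes over the data; the trade-off between the two costs is the tuning question the abstract describes.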