Systematic construction of anomaly detection benchmarks from real data

  • Authors: Andrew F. Emmott; Shubhomoy Das; Thomas Dietterich; Alan Fern; Weng-Keen Wong

  • Affiliations: Oregon State University, Corvallis, Oregon (all authors)

  • Venue: Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description
  • Year: 2013

Abstract

Research in anomaly detection suffers from a lack of realistic, publicly available problem sets. This paper discusses what properties such problem sets should possess. It then introduces a methodology for transforming existing classification data sets into ground-truthed benchmark data sets for anomaly detection. The methodology produces data sets that vary along three important dimensions: (a) point difficulty, (b) relative frequency of anomalies, and (c) clusteredness. We use the generated data sets to benchmark several popular anomaly detection algorithms under a range of conditions.
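
To make the idea of turning a classification data set into a ground-truthed anomaly benchmark concrete, the following is a minimal sketch and not the authors' procedure. It covers only one of the three dimensions, the relative frequency of anomalies; point difficulty and clusteredness are not modeled. The function name make_benchmark, its parameters, and the rule "treat the chosen classes as anomalies and subsample them" are all assumptions made for illustration.

```python
import random


def make_benchmark(points, labels, anomaly_classes, anomaly_rate, seed=0):
    """Hypothetical helper: build a labeled anomaly benchmark.

    Points whose class label is in `anomaly_classes` are candidate anomalies.
    All other points are kept as normal, and candidates are subsampled so that
    anomalies make up roughly `anomaly_rate` of the returned data.
    Returns a shuffled list of (point, is_anomaly) pairs, is_anomaly in {0, 1}.
    """
    if not 0.0 < anomaly_rate < 1.0:
        raise ValueError("anomaly_rate must be strictly between 0 and 1")

    rng = random.Random(seed)  # local generator, so results are reproducible

    normals = [p for p, y in zip(points, labels) if y not in anomaly_classes]
    candidates = [p for p, y in zip(points, labels) if y in anomaly_classes]

    # Choose k so that k / (k + len(normals)) is close to anomaly_rate,
    # capped by the number of candidate anomalies actually available.
    k = round(anomaly_rate * len(normals) / (1.0 - anomaly_rate))
    k = min(k, len(candidates))

    anomalies = rng.sample(candidates, k)

    data = [(p, 0) for p in normals] + [(p, 1) for p in anomalies]
    rng.shuffle(data)
    return data


if __name__ == "__main__":
    # Toy data: 90 points of class "a" (normal), 50 of class "b" (anomalous).
    pts = [(float(i), 0.0) for i in range(140)]
    lbls = ["a"] * 90 + ["b"] * 50
    bench = make_benchmark(pts, lbls, {"b"}, anomaly_rate=0.1)
    print(len(bench), sum(flag for _, flag in bench))
```

Because the ground-truth flag travels with every point, a detector's scores on such a benchmark can be evaluated directly against the known anomalies, and varying anomaly_rate yields a family of benchmarks from a single source data set.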