FRaC: a feature-modeling approach for semi-supervised and unsupervised anomaly detection

  • Authors:
  • Keith Noto;Carla Brodley;Donna Slonim

  • Affiliations:
  • Department of Computer Science, Tufts University, Medford, USA 02155;Department of Computer Science, Tufts University, Medford, USA 02155;Department of Computer Science, Tufts University, Medford, USA 02155

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called "normal" instances). Given a training set of only normal data, the semi-supervised anomaly detection task is to identify anomalies in the future. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: Given unlabeled, mostly-normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach for semi-supervised anomaly detection. FRaC is based on using normal instances to build an ensemble of feature models, and then identifying instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to well-known state-of-the-art anomaly detection methods, LOF and one-class support vector machines, and to an existing feature-modeling approach.