Detection and prediction of distance-based outliers

  • Authors:
  • Fabrizio Angiulli; Stefano Basta; Clara Pizzuti

  • Affiliations:
  • ICAR-CNR, Via Pietro Bucci, Rende (CS), Italy; ICAR-CNR, Via Pietro Bucci, Rende (CS), Italy; ICAR-CNR, Via Pietro Bucci, Rende (CS), Italy

  • Venue:
  • Proceedings of the 2005 ACM symposium on Applied computing
  • Year:
  • 2005

Abstract

We present an unsupervised distance-based outlier detection method that learns a model from the objects in a data set. The learned model, called a solving set, is a small subset of the data set that is used to classify new, unseen objects as outliers or not. We provide an algorithm that computes a solving set in sub-quadratic time, and we give experimental evidence that the computed solving set is small and that the false positive rate, i.e., the fraction of new objects misclassified as outliers when the solving set is used instead of the whole data set, is negligible.
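
To make the prediction step concrete, the following is a minimal Python sketch of classifying new objects against a reference subset rather than the whole data set. It is not the authors' algorithm: the abstract does not state the outlier score or how the solving set is built, so the sketch assumes a common distance-based score (the sum of Euclidean distances to the k nearest neighbors) and a fixed threshold, and it takes the subset as given. The names knn_weight, is_outlier, false_positive_rate, and the sample data are all hypothetical.

```python
import math

def knn_weight(q, reference, k):
    """Assumed score: sum of Euclidean distances from q to its k nearest points in reference."""
    dists = sorted(math.dist(q, p) for p in reference)
    return sum(dists[:k])

def is_outlier(q, reference, k, threshold):
    """Flag q as an outlier when its score against reference reaches the (assumed) threshold."""
    return knn_weight(q, reference, k) >= threshold

def false_positive_rate(new_objects, data, subset, k, threshold):
    """Fraction of new objects flagged using subset but not flagged using the full data."""
    if not new_objects:
        return 0.0
    false_positives = sum(
        1
        for q in new_objects
        if is_outlier(q, subset, k, threshold) and not is_outlier(q, data, k, threshold)
    )
    return false_positives / len(new_objects)

# Hypothetical toy data: a small cluster plus one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
subset = [(0, 0), (1, 1), (5, 5)]   # stand-in for a solving set, chosen by hand
new_objects = [(0.5, 0.5), (10, 10), (0, 1)]
k, threshold = 2, 1.5

for q in new_objects:
    print(q, is_outlier(q, subset, k, threshold), is_outlier(q, data, k, threshold))
print(false_positive_rate(new_objects, data, subset, k, threshold))
```

Under this assumed score, a subset can only overestimate an object's weight, because the k nearest neighbors within a subset are never closer than the k nearest neighbors in the full data set. Misclassifications are therefore one-sided, which is consistent with the abstract reporting only a false positive rate.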