A vertical distance-based outlier detection method with local pruning

  • Authors:
  • Dongmei Ren; Imad Rahal; William Perrizo; Kirk Scott

  • Affiliations:
  • North Dakota State University, Fargo, ND; North Dakota State University, Fargo, ND; North Dakota State University, Fargo, ND; The University of Alaska Anchorage, Anchorage, AK

  • Venue:
  • Proceedings of the thirteenth ACM international conference on Information and knowledge management
  • Year:
  • 2004

Abstract

"One person's noise is another person's signal." Outlier detection is used both to clean up datasets and to discover useful anomalies, such as criminal activity in electronic commerce, computer intrusion attacks, terrorist threats, and agricultural pest infestations. Outlier detection is thus critically important in an information-based society. This paper focuses on finding outliers in large datasets using distance-based methods. First, to speed up outlier detection, we revise Knorr and Ng's distance-based outlier definition; second, we adopt a vertical data structure, instead of traditional horizontal structures, to make outlier detection more efficient still. We tested our methods on a National Hockey League dataset and show an order-of-magnitude speed improvement over contemporary distance-based outlier detection approaches.
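
For orientation, the original Knorr and Ng definition that the abstract refers to can be sketched as follows: an object is a DB(p, D)-outlier if at least a fraction p of the objects in the dataset lie farther than distance D from it. The sketch below is a naive pairwise baseline under that definition only. It does not reproduce the revised definition, the local pruning, or the vertical data structure proposed in the paper, and the function name `db_outliers` and the sample points are hypothetical, chosen purely for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def db_outliers(points, p, d):
    """Return the points that are DB(p, d)-outliers (naive O(n^2) check).

    A point x qualifies if at least a fraction p of all points in the
    dataset lie at a distance greater than d from x.
    """
    n = len(points)
    outliers = []
    for x in points:
        # x is at distance 0 from itself, so it never counts as "far".
        far = sum(1 for y in points if dist(x, y) > d)
        if far >= p * n:
            outliers.append(x)
    return outliers


# Hypothetical sample data: four clustered points and one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(db_outliers(sample, p=0.75, d=3))  # prints [(10, 10)]
```

The quadratic pairwise scan is the cost that faster distance-based approaches, including the one described in this abstract, aim to avoid on large datasets.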