SPOT: A System for Detecting Projected Outliers From High-dimensional Data Streams

  • Authors:
  • Ji Zhang; Qigang Gao; Hai Wang

  • Affiliations:
  • Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada. jiz@cs.dal.ca; Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada. qggao@cs.dal.ca; Sobey School of Business, Saint Mary's University, Halifax, Nova Scotia, Canada. hwang@smu.ca

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

In this paper, we present a new technique, called Stream Projected Outlier deTector (SPOT), to address the outlier detection problem in high-dimensional data streams. SPOT is unique in a number of aspects. First, SPOT employs a novel window-based time model and decaying cell summaries to capture statistics from the data stream. Second, a Sparse Subspace Template (SST), a set of top sparse subspaces obtained by unsupervised and/or supervised learning processes, is constructed in SPOT to detect projected outliers effectively. A Multi-Objective Genetic Algorithm (MOGA) is employed as an effective search method in unsupervised learning for finding outlying subspaces from training data. Finally, the SST is able to carry out online self-evolution to cope with the dynamics of data streams. This paper provides details on the motivation and technical challenges of detecting outliers from high-dimensional data streams, presents an overview of SPOT, and gives the plans for the system demonstration of SPOT.
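
The abstract does not define the decaying cell summaries, so the following is only a minimal sketch of the general idea, not SPOT's actual data structure. It assumes an equi-width grid over one chosen subspace and an exponential per-arrival decay; the class name `DecayedCellCounts` and the parameters `subspace`, `cell_width`, and `decay` are hypothetical names introduced for illustration.

```python
import math


class DecayedCellCounts:
    """Illustrative decayed density per grid cell in one projected subspace.

    Assumption: exponential decay per arriving point. This is not taken
    from the paper; it only shows how old points can fade from a summary.
    """

    def __init__(self, subspace, cell_width, decay):
        self.subspace = tuple(subspace)  # indices of the projected dimensions
        self.cell_width = cell_width     # side length of each grid cell
        self.decay = decay               # decay factor in (0, 1] per arrival
        self.cells = {}                  # cell -> (density, tick of last update)
        self.tick = 0                    # number of points seen so far

    def cell_of(self, point):
        """Map a full-dimensional point to its cell in the subspace."""
        return tuple(math.floor(point[d] / self.cell_width) for d in self.subspace)

    def density(self, cell):
        """Current decayed density of a cell (0.0 if never populated)."""
        value, last = self.cells.get(cell, (0.0, self.tick))
        return value * self.decay ** (self.tick - last)

    def add(self, point):
        """Absorb one stream point; return its cell and the updated density."""
        self.tick += 1
        cell = self.cell_of(point)
        updated = self.density(cell) + 1.0
        self.cells[cell] = (updated, self.tick)
        return cell, updated


summary = DecayedCellCounts(subspace=(0, 2), cell_width=1.0, decay=0.5)
summary.add((0.2, 9.0, 0.7))
summary.add((0.9, -5.0, 0.1))         # same cell when projected on dims 0 and 2
cell, density = summary.add((5.5, 0.0, 5.5))
is_sparse = density < 1.2             # hypothetical sparsity threshold
```

Under these assumptions, a point that lands in a cell whose decayed density stays below a chosen threshold in some subspace would be a candidate projected outlier in that subspace. Storing the last-update tick with each cell lets the decay be applied lazily, so no pass over all cells is needed when a new point arrives.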