Generalized projected clustering in high-dimensional data streams

  • Authors:
  • Ting Wang

  • Affiliations:
  • Computer Science Dept., University of British Columbia

  • Venue:
  • APWeb'06: Proceedings of the 8th Asia-Pacific Web Conference on Frontiers of WWW Research and Development
  • Year:
  • 2006

Abstract

Consider the problem of identifying dense subgroups of data points that exhibit strong correlations in a data stream. Such correlation-connected clusters are meaningful in many applications. However, the inherent sparsity of high-dimensional space means that correlations are local to specific subspaces; moreover, a correlation can lie along an arbitrarily complex direction, which defeats most traditional methods. We present ACID, a framework that can effectively detect correlation-connected clusters in high-dimensional streams. It scales well with both the size of the stream and the dimensionality of the data, and it is robust against noise. Experiments on synthetic and real datasets show its effectiveness and efficiency.