GraphZip: a fast and automatic compression method for spatial data clustering

  • Authors:
  • Yu Qian; Kang Zhang

  • Affiliations:
  • The University of Texas at Dallas, Richardson, TX; The University of Texas at Dallas, Richardson, TX

  • Venue:
  • Proceedings of the 2004 ACM symposium on Applied computing
  • Year:
  • 2004

Abstract

Spatial data mining presents new challenges because of the large size and high dimensionality of spatial data. A common approach to these challenges is to compress the initial database and then process the compressed data. This paper presents a novel spatial data compression method, called GraphZip, that produces a compact representation of the original data set. GraphZip has two advantages. First, the spatial pattern of the original data set is preserved in the compressed data. Second, data of arbitrary dimensionality can be processed efficiently and automatically. Applying GraphZip to huge databases can enhance both the effectiveness and the efficiency of spatial data clustering: running a clustering algorithm on the compressed data set requires less running time while the pattern can still be discovered, and the complexity of clustering is dramatically reduced. A general hierarchical clustering method using GraphZip is proposed in this paper. Experimental studies on four benchmark spatial data sets produce very encouraging results.
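
The abstract does not describe GraphZip's internals, so the following is only a minimal sketch of the general compress-then-cluster idea it mentions, not the authors' method. It assumes a simple grid-based compression as a stand-in (each occupied grid cell is replaced by the centroid of its points) followed by a naive single-link agglomerative clustering of the representatives. The function names `compress_by_grid` and `single_link_clusters` and the parameters `cell_size` and `k` are hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import dist, floor


def compress_by_grid(points, cell_size):
    """Stand-in compression (an assumption, NOT GraphZip).

    Groups points by the grid cell they fall into and returns one
    (centroid, count) pair per occupied cell. Works for any dimension.
    """
    cells = defaultdict(list)
    for p in points:
        key = tuple(floor(c / cell_size) for c in p)
        cells[key].append(p)

    representatives = []
    for members in cells.values():
        n = len(members)
        centroid = tuple(sum(coords) / n for coords in zip(*members))
        representatives.append((centroid, n))
    return representatives


def single_link_clusters(representatives, k):
    """Naive single-link agglomerative clustering of the representatives.

    Repeatedly merges the two closest clusters until k clusters remain.
    """
    clusters = [[centroid] for centroid, _ in representatives]
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


# Two well-separated groups of 2-D points (illustrative data only).
points = [(0.1, 0.1), (0.2, 0.15), (0.3, 0.2),
          (5.0, 5.0), (5.1, 5.2), (5.3, 5.1)]

reps = compress_by_grid(points, cell_size=1.0)
clusters = single_link_clusters(reps, k=2)
print(len(points), "points compressed to", len(reps), "representatives")
print(len(clusters), "clusters found on the compressed data")
```

The pairwise merge step is quadratic or worse in the number of items it sees, which is why running it on the representatives rather than on the raw points reduces the cost. Whether the spatial pattern survives depends entirely on the compression step, which is the part GraphZip is designed to handle and which the grid stand-in above only approximates.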