Density-based data clustering algorithms for lower dimensions using space-filling curves

  • Authors:
  • Bin Xu; Danny Z. Chen

  • Affiliations:
  • Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN (both authors)

  • Venue:
  • PAKDD'07: Proceedings of the 11th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
  • Year:
  • 2007

Abstract

We present two new density-based algorithms for clustering data points in low dimensions (at most 10). Under some reasonable assumptions, both algorithms compute density-based clusters and noise in O(n) CPU time, space, and I/O cost, where n is the number of input points. In addition to packing the data structure into buckets and using block-access techniques to reduce I/O cost, our algorithms apply space-filling curves to reduce the number of disk accesses. Our first algorithm (Algorithm A) targets input data that is not highly clustered, while the second (Algorithm B) targets highly clustered input data. We implemented both algorithms, evaluated the effects of various space-filling curves, identified the best curve for our approaches, and conducted an extensive performance evaluation. The experiments show that our algorithms achieve high performance, and we believe they are of considerable practical value.
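
To illustrate the general idea of ordering data along a space-filling curve before packing it into buckets, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say which curve performed best, so the Z-order (Morton) curve is used here purely as an assumption, and the names `morton_key`, `pack_into_buckets`, `cell_size`, and `bucket_capacity` are hypothetical. The sketch also assumes non-negative coordinates.

```python
from collections import defaultdict


def morton_key(cell, bits=10):
    """Interleave the bits of non-negative integer grid coordinates.

    The result is the cell's position along a Z-order (Morton) curve.
    `bits` is an assumed upper bound on the bit length of each coordinate.
    """
    key = 0
    d = len(cell)
    for b in range(bits):
        for i, c in enumerate(cell):
            # Bit b of coordinate i goes to position b * d + i of the key.
            key |= ((c >> b) & 1) << (b * d + i)
    return key


def pack_into_buckets(points, cell_size, bucket_capacity, bits=10):
    """Group points by grid cell, order the cells along the Z-order curve,
    and pack the points into fixed-capacity buckets (stand-ins for disk blocks).
    """
    cells = defaultdict(list)
    for p in points:
        cell = tuple(int(x // cell_size) for x in p)
        cells[cell].append(p)

    # Cells that are close on the curve tend to be close in space, so
    # neighboring cells tend to land in the same or adjacent buckets.
    ordered_cells = sorted(cells, key=lambda c: morton_key(c, bits))

    buckets, current = [], []
    for cell in ordered_cells:
        for p in cells[cell]:
            current.append(p)
            if len(current) == bucket_capacity:
                buckets.append(current)
                current = []
    if current:
        buckets.append(current)
    return buckets


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5), (2.5, 0.5)]
    for bucket in pack_into_buckets(pts, cell_size=1.0, bucket_capacity=2):
        print(bucket)
```

In a density-based setting, a neighborhood query around a point touches only a few adjacent grid cells. Laying cells out in curve order is one plausible way to make those cells fall into a small number of buckets, which is the kind of disk-access reduction the abstract describes.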