Clustering-based histograms for multi-dimensional data

  • Authors: Filippo Furfaro; Giuseppe M. Mazzeo; Cristina Sirangelo
  • Affiliation (all authors): DEIS, University of Calabria, Rende, Italy
  • Venue: DaWaK'05, Proceedings of the 7th International Conference on Data Warehousing and Knowledge Discovery
  • Year: 2005

Abstract

A new technique for constructing multi-dimensional histograms is proposed. This technique first invokes a density-based clustering algorithm to locate dense and sparse regions of the input data. Then the data distribution inside each of these regions is summarized by partitioning the region into non-overlapping blocks laid onto a grid. The granularity of this grid is chosen depending on the underlying data distribution: the more homogeneous the data, the coarser the grid. Our approach is compared with state-of-the-art histograms on both synthetic and real-life data and is shown to be more effective.
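
The grid-granularity idea in the abstract (the more homogeneous the data, the coarser the grid) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the clustering step has already produced one rectangular two-dimensional region, and the function names, the coefficient-of-variation homogeneity test, and the `tolerance` and `max_k` parameters are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def grid_counts(points: List[Point], box: Box, k: int) -> List[List[int]]:
    """Count the points falling in each cell of a k x k grid laid over box."""
    x0, y0, x1, y1 = box
    counts = [[0] * k for _ in range(k)]
    for x, y in points:
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            continue  # ignore points outside the region
        # Clamp so that points on the upper border fall in the last cell.
        i = min(int((x - x0) / (x1 - x0) * k), k - 1)
        j = min(int((y - y0) / (y1 - y0) * k), k - 1)
        counts[i][j] += 1
    return counts


def relative_spread(values: List[int]) -> float:
    """Coefficient of variation of the counts (0.0 means perfectly even)."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # an empty block is trivially homogeneous
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return variance ** 0.5 / mean


def choose_grid(points: List[Point], box: Box,
                tolerance: float = 0.25, max_k: int = 16) -> int:
    """Return the coarsest k (a power of two) whose blocks all look homogeneous.

    Assumed criterion: a block is homogeneous when splitting it into four
    quadrants yields counts whose relative spread is within `tolerance`.
    """
    k = 1
    while k < max_k:
        finer = grid_counts(points, box, 2 * k)
        homogeneous = all(
            relative_spread([finer[2 * i][2 * j], finer[2 * i][2 * j + 1],
                             finer[2 * i + 1][2 * j], finer[2 * i + 1][2 * j + 1]])
            <= tolerance
            for i in range(k) for j in range(k)
        )
        if homogeneous:
            return k
        k *= 2
    return k
```

Under these assumptions, an evenly spaced 8 x 8 lattice of points over the unit square is summarized by a single block (`k = 1`), whereas the same 64 points packed into the corner `[0, 0.25] x [0, 0.25]` push the grid to `k = 4` before every block looks homogeneous.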