Using the Real Dimension of the Data

  • Authors: Christian Zirkelbach
  • Affiliations: -
  • Venue: DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
  • Year: 1999

Abstract

This paper presents a method for extracting the real dimension of a large data set in a high-dimensional data cube and indicates its use for visual data mining. A similarity measure structures a data set in a general but weak sense. If the elements are part of a high-dimensional host space (the primary space), for instance a data warehouse cube, the resulting structure does not necessarily reflect the real dimension of the embedded (secondary) space. Mapping the set into the lower-dimensional secondary space loses no information with regard to the semantics defined by the measure, yet it reduces storage and computing effort. Additionally, the secondary space itself reveals much about the set's structure and can facilitate data mining. We propose adding the property of a dimension to a metric and show how to determine the real (in general fractal) dimension of the underlying data set.
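
The abstract does not spell out how the real dimension is computed, so the following is only an illustrative sketch, not the paper's method. It swaps in a standard estimator, the correlation dimension (the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r), and assumes a Euclidean metric. The function names, the radii, and the synthetic data set (a line segment embedded in a 10-dimensional host space) are all assumptions made for illustration.

```python
import math
import random


def pairwise_distances(points):
    """Return all pairwise Euclidean distances (the metric is an assumption)."""
    n = len(points)
    return [math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)]


def correlation_dimension(points, radii):
    """Estimate the correlation dimension as the least-squares slope of
    log C(r) versus log r, where C(r) is the fraction of pairs closer than r."""
    dists = pairwise_distances(points)
    xs, ys = [], []
    for r in radii:
        fraction = sum(1 for d in dists if d < r) / len(dists)
        if fraction > 0:  # skip radii with no pairs to avoid log(0)
            xs.append(math.log(r))
            ys.append(math.log(fraction))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical data: a 1-dimensional set (secondary space) embedded in a
# 10-dimensional host (primary) space.
random.seed(0)
direction = [random.gauss(0.0, 1.0) for _ in range(10)]
norm = math.sqrt(sum(c * c for c in direction))
direction = [c / norm for c in direction]
line_points = [[t * c for c in direction]
               for t in (random.random() for _ in range(400))]

# Radii spaced geometrically between 0.01 and 0.1 (an arbitrary choice).
radii = [0.01 * 10 ** (k / 7) for k in range(8)]
estimate = correlation_dimension(line_points, radii)
print(f"estimated dimension: {estimate:.2f}")
```

For this synthetic set the estimate should come out near 1, far below the 10 dimensions of the host space, which is the kind of gap between primary and secondary space the abstract describes. The result depends on the range of radii chosen, so on real data the scaling region has to be picked with care.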