Using the Real Dimension of the Data

Authors:
Christian Zirkelbach
Affiliations:
-
Venue:
DaWaK '99 Proceedings of the First International Conference on Data Warehousing and Knowledge Discovery
Year:
1999

Abstract

This paper presents a method for extracting the real dimension of a large data set in a high-dimensional data cube and indicates its use for visual data mining. A similarity measure structures a data set in a general but weak sense. If the elements are part of a high-dimensional host space (the primary space), for instance a data warehouse cube, the resulting structure does not necessarily reflect the real dimension of the embedded (secondary) space. Mapping the set into the lower-dimensional secondary space loses no information with regard to the semantics defined by the measure, yet it reduces storage and computational effort. In addition, the secondary space itself reveals much about the set's structure and can facilitate data mining. We propose adding the property of a dimension to a metric and show how to determine the real (in general fractal) dimension of the underlying data set.
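
To make the idea of a "real" dimension concrete, the sketch below estimates a fractal dimension from pairwise distances alone. It is an illustration only and not the method of the paper: it uses the well-known correlation-dimension estimate (the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r). The function names `correlation_dimension` and `euclidean`, the toy data, and the chosen radii are all assumptions made for this example.

```python
import math
import random


def euclidean(p, q):
    """Euclidean distance; any other metric could be passed in instead."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def correlation_dimension(points, dist, radii):
    """Estimate the correlation dimension using only the metric `dist`.

    Hypothetical helper for illustration: fits the slope of
    log C(r) versus log r by ordinary least squares.
    """
    n = len(points)
    # Pairwise distances; the host-space coordinates are never inspected
    # directly, only the values returned by the metric.
    d = [dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]
    xs, ys = [], []
    for r in radii:
        c = sum(1 for v in d if v < r) / len(d)  # correlation sum C(r)
        if c > 0:
            xs.append(math.log(r))
            ys.append(math.log(c))
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Toy data (assumption): points on a straight line segment embedded
# in a 10-dimensional host space, so the expected dimension is close to 1.
rng = random.Random(0)
direction = [1 / math.sqrt(10)] * 10  # unit vector in the host space
line_points = [[t * c for c in direction]
               for t in (rng.random() for _ in range(400))]

radii = [0.01 * 1.5 ** k for k in range(6)]
estimate = correlation_dimension(line_points, euclidean, radii)
print(f"estimated dimension: {estimate:.2f}")
```

Although the points live in a 10-dimensional primary space, the estimate depends only on the distances between them, which is why it can come out far below 10. The estimate is sensitive to the range of radii and to the sample size, so in practice the scaling region has to be chosen with care.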