Mining Hierarchies of Correlation Clusters

  • Authors:
  • Elke Achtert;Christian Böhm;Peer Kröger;Arthur Zimek

  • Affiliations:
  • University of Munich, Germany;University of Munich, Germany;University of Munich, Germany;University of Munich, Germany

  • Venue:
  • SSDBM '06 Proceedings of the 18th International Conference on Scientific and Statistical Database Management
  • Year:
  • 2006

Abstract

The detection of correlations between different features in high-dimensional data sets is an important data mining task. These correlations can be arbitrarily complex: one or more features might be correlated with several other features, and both the noise features and the actual dependencies may differ from cluster to cluster. Each cluster therefore contains points that lie on a common hyperplane of arbitrary dimensionality in the data space and thus spans a separate, arbitrarily oriented subspace of the original data space. The few recently proposed algorithms designed to uncover these correlation clusters have several disadvantages. In particular, they cannot detect correlation clusters of different dimensionality that are nested within each other. The complete hierarchical structure of correlation clusters of varying dimensionality can only be detected by a hierarchical clustering approach. We therefore propose the algorithm HiCO (Hierarchical Correlation Ordering), the first hierarchical approach to correlation clustering. The algorithm determines the cluster hierarchy and visualizes it using correlation diagrams. Several comparative experiments on synthetic and real data sets demonstrate the performance and effectiveness of HiCO.
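
The geometric idea in the abstract, that clusters lie on hyperplanes of different dimensionality and that a lower-dimensional cluster can be nested inside a higher-dimensional one, can be illustrated with a small sketch. This is not the HiCO algorithm and is not taken from the paper: the function name `affine_dimension`, the tolerance `tol`, and the toy data are hypothetical, and the sketch assumes noise-free points, which real data would not satisfy.

```python
def affine_dimension(points, tol=1e-9):
    """Dimension of the smallest hyperplane (affine subspace) containing all points.

    Hypothetical helper for illustration only; assumes noise-free data.
    """
    n = len(points)
    d = len(points[0])
    # Center the points so the hyperplane passes through the origin.
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    rows = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Gaussian elimination with partial pivoting; the rank of the centered
    # data equals the dimensionality of the hyperplane.
    rank = 0
    for col in range(d):
        pivot = max(range(rank, n), key=lambda r: abs(rows[r][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) <= tol:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, d):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


# A 2-dimensional cluster: points on the plane z = x + y in 3-D space.
plane = [(x, y, x + y) for x in range(4) for y in range(4)]
# A 1-dimensional cluster: points on the line (t, 2t, 3t), which lies inside that plane.
line = [(t, 2 * t, 3 * t) for t in range(5)]

print(affine_dimension(line))          # 1
print(affine_dimension(plane))         # 2
print(affine_dimension(plane + line))  # 2: the line is nested in the plane
```

In this toy setting, a method that looks only for 2-dimensional structure would report the plane and miss the line inside it, which is the kind of nesting a hierarchical approach is meant to expose.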