High-Dimensional Multimodal Distribution Embedding

  • Authors:
  • Eniko Szekely;Eric Bruno;Stephane Marchand-Maillet

  • Venue:
  • ICDMW '10 Proceedings of the 2010 IEEE International Conference on Data Mining Workshops
  • Year:
  • 2010

Abstract

High-dimensional data is emerging in an increasingly wide range of domains, but its analysis has proved difficult because of the curse of dimensionality. Dimension reduction has emerged as a powerful tool for overcoming problems related to high dimensionality, yet the curse of dimensionality continues to affect many existing methods. This paper concentrates on low-dimensional distance-based embeddings for high-dimensional multimodal distributions, i.e. clustered data. Pairwise distances are particularly affected by high dimensionality, and their analysis forms the basis of the embedding method presented here, called HDME. To avoid the problems of high dimensionality, HDME performs a distance transformation based on interpoint relationships. The positive influence of the transformation in preserving and emphasizing clusters is first demonstrated using label information. The distance transformation is driven by an estimate of the neighbourhood information. The transformed distances are then embedded in a low-dimensional space using a classical embedding method. Experiments on real-world data show that distance transformations can be used effectively in conjunction with distance-based embedding methods to obtain representation spaces that discriminate clusters well.
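
The abstract's remark that pairwise distances are strongly affected by high dimensionality can be illustrated with a small experiment. The sketch below is not taken from the paper and does not implement HDME. It assumes two unit-variance Gaussian clusters whose centres sit a fixed distance apart along one axis, and it measures how the ratio of mean between-cluster distance to mean within-cluster distance changes as dimensions are added. The function names, the separation of 4.0, and the sample sizes are all illustrative assumptions.

```python
import math
import random


def sample_two_clusters(n_per_cluster, dim, separation, rng):
    """Draw two unit-variance Gaussian clusters (assumed toy data).

    The cluster centres differ by `separation` along axis 0 only.
    """
    points, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_cluster):
            p = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            p[0] += label * separation
            points.append(p)
            labels.append(label)
    return points, labels


def between_within_ratio(points, labels):
    """Mean between-cluster distance divided by mean within-cluster distance."""
    within, between = [], []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])  # Euclidean distance
            if labels[i] == labels[j]:
                within.append(d)
            else:
                between.append(d)
    return (sum(between) / len(between)) / (sum(within) / len(within))


def contrast_by_dimension(dims=(2, 10, 100, 1000), n_per_cluster=30,
                          separation=4.0, seed=0):
    """Return {dimension: between/within ratio} for the toy mixture."""
    rng = random.Random(seed)
    result = {}
    for dim in dims:
        points, labels = sample_two_clusters(n_per_cluster, dim, separation, rng)
        result[dim] = between_within_ratio(points, labels)
    return result


if __name__ == "__main__":
    for dim, ratio in contrast_by_dimension().items():
        print(f"dim={dim:5d}  between/within ratio={ratio:.3f}")
```

With the separation held fixed, the ratio falls towards 1 as the dimension grows: the raw distances carry less and less cluster contrast. This is the kind of effect that motivates transforming distances before embedding them.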