Mining interlacing manifolds in high dimensional spaces

  • Authors:
  • Tao Ban;Changshui Zhang;Shigeo Abe;Takeshi Takahashi;Youki Kadobayashi

  • Affiliations:
  • National Institute of Information and Communications Technology, Tokyo, Japan;Tsinghua University, Beijing, China;Kobe University, Kobe, Japan;National Institute of Information and Communications Technology, Tokyo, Japan;National Institute of Information and Communications Technology, Tokyo, Japan

  • Venue:
  • Proceedings of the 2011 ACM Symposium on Applied Computing
  • Year:
  • 2011

Abstract

Real-world data are often composed of conceptually meaningful subspaces; for example, for portraits in a facial image database, the illumination factor corresponds to one nonlinear subspace and the rotation factor to another. The interlacing of these subspaces can greatly increase the complexity of the data and impede understanding and further processing. To identify interlacing subspaces and extract the essential structural knowledge from a given dataset, we present a novel approach termed Multi-Manifold Partition (MMP). Global manifolds that correspond to conceptually meaningful subspaces are discovered in three steps. First, a neighborhood graph is built to capture the intrinsic topological structure of the input data. Second, the uniformity of neighboring nodes is analyzed, and segments of manifolds are created by connecting adjacent samples that conform in dimension. Finally, segments that possibly belong to the same manifold are combined to obtain a global representation of the underlying subspaces. Experimental results on two synthetic datasets and a practical Optical Character Recognition (OCR) problem show that MMP is effective in extracting interlacing structures and thus offers a better interpretation of the nature of the data.
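
The three steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how local dimension is estimated or how segments are combined, so the choices below are assumptions made only for illustration. The sketch handles 2-D points, estimates local dimension from the eigenvalue ratio of the neighborhood covariance (threshold `ratio` is hypothetical), and merges segments of equal dimension whose closest points lie within a hypothetical `max_gap`. All function names are invented for this example.

```python
import math
from collections import defaultdict

def knn_graph(points, k):
    """Step 1: brute-force k-nearest-neighbor graph (list of neighbor indices)."""
    graph = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        graph.append(sorted(others, key=lambda j: math.dist(p, points[j]))[:k])
    return graph

def local_dimension(points, i, neighbors, ratio=0.05):
    """Assumed estimator: 1 if the neighborhood is nearly collinear, else 2."""
    pts = [points[i]] + [points[j] for j in neighbors]
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mx) ** 2 for p in pts) / n
    syy = sum((p[1] - my) ** 2 for p in pts) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / n
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    half_trace = (sxx + syy) / 2
    disc = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return 1 if half_trace - disc <= ratio * (half_trace + disc) else 2

def components(n, edges):
    """Connected components over n nodes via union-find."""
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return list(groups.values())

def build_segments(points, k=4):
    """Step 2: connect neighboring samples whose estimated dimensions agree."""
    graph = knn_graph(points, k)
    dims = [local_dimension(points, i, graph[i]) for i in range(len(points))]
    edges = [(i, j) for i in range(len(points)) for j in graph[i]
             if dims[i] == dims[j]]
    return components(len(points), edges), dims

def merge_segments(points, segments, dims, max_gap):
    """Step 3 (assumed criterion): merge same-dimension segments that are close."""
    def gap(s, t):
        return min(math.dist(points[i], points[j]) for i in s for j in t)
    edges = [(a, b)
             for a in range(len(segments)) for b in range(a + 1, len(segments))
             if dims[segments[a][0]] == dims[segments[b][0]]
             and gap(segments[a], segments[b]) <= max_gap]
    return [sorted(i for s in group for i in segments[s])
            for group in components(len(segments), edges)]

# Toy data: a line broken by a gap (1-D) and a filled square patch (2-D).
line_a = [(float(x), 0.0) for x in range(10)]
line_b = [(float(x), 0.0) for x in range(14, 24)]
patch = [(100.0 + a, 100.0 + b) for a in range(6) for b in range(6)]
points = line_a + line_b + patch

segments, dims = build_segments(points, k=4)
manifolds = merge_segments(points, segments, dims, max_gap=6.0)
print(len(segments), "segments merged into", len(manifolds), "manifolds")
```

On this toy input the two halves of the broken line form separate segments in step 2, because no nearest-neighbor edge crosses the gap, and are rejoined in step 3, while the square patch stays separate because its estimated dimension differs. A real dataset would need a dimension estimator that works in higher dimensions and a more careful merging rule than the distance threshold used here.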