Estimation of Subspace Arrangements with Applications in Modeling and Segmenting Mixed Data

  • Authors:
  • Yi Ma;Allen Y. Yang;Harm Derksen;Robert Fossum

  • Affiliations:
  • yima@uiuc.edu;yang@eecs.berkeley.edu;hderksen@umich.edu;rmfossum@uiuc.edu

  • Venue:
  • SIAM Review
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Recently many scientific and engineering applications have involved the challenging task of analyzing large amounts of unsorted high-dimensional data that have very complicated structures. From both geometric and statistical points of view, such unsorted data are considered mixed as different parts of the data have significantly different structures which cannot be described by a single model. In this paper we propose to use subspace arrangements—a union of multiple subspaces—for modeling mixed data: each subspace in the arrangement is used to model just a homogeneous subset of the data. Thus, multiple subspaces together can capture the heterogeneous structures within the data set. In this paper, we give a comprehensive introduction to a new approach for the estimation of subspace arrangements. This is known as generalized principal component analysis (GPCA). In particular, we provide a comprehensive summary of important algebraic properties and statistical facts that are crucial for making the inference of subspace arrangements both efficient and robust, even when the given data are corrupted by noise or contaminated with outliers. This new method in many ways improves and generalizes extant methods for modeling or clustering mixed data. There have been successful applications of this new method to many real-world problems in computer vision, image processing, and system identification. In this paper, we will examine several of those representative applications. This paper is intended to be expository in nature. However, in order that this may serve as a more complete reference for both theoreticians and practitioners, we take the liberty of filling in several gaps between the theory and the practice in the existing literature.