Pattern change discovery between high dimensional data sets

  • Authors:
  • Yi Xu;Zhongfei Zhang;Philip S. Yu;Bo Long

  • Affiliations:
  • Binghamton University, Vestal, NY, USA;Binghamton University, Vestal, USA;University of Illinois at Chicago, Chicago, USA;Yahoo! Inc, Sunnyvale, USA

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

This paper investigates the general problem of discovering pattern changes between high-dimensional data sets. Existing methods either focus mainly on detecting magnitude changes in low-dimensional data sets or operate within supervised frameworks. We introduce the principal angles between subspaces as a measure of the subspace difference between two high-dimensional data sets. Principal angles have the property of isolating subspace change from magnitude change. Because computing the principal angles directly is challenging, we use matrix factorization as a statistical framework and develop the principle of dominant subspace mapping, which transfers principal-angle-based detection to a matrix factorization problem. We show how matrix factorization can be naturally embedded into the likelihood ratio test based on linear models. The proposed method is unsupervised and addresses the statistical significance of pattern changes between high-dimensional data sets. We demonstrate the power and effectiveness of the method in several real-world applications.
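
To make the notion of principal angles concrete, the following is a minimal sketch of the textbook computation (orthonormalize each basis with QR, then take the singular values of the product of the two bases). It is not the paper's method, which avoids direct computation through matrix factorization. The function name `principal_angles`, the synthetic data, and the assumption that both input matrices have full column rank are illustrative choices, not taken from the paper.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B.

    Assumes A and B have full column rank (illustrative assumption).
    """
    # Orthonormal bases for the two column spaces.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Synthetic data: 50-dimensional observations spanning a 3-dimensional subspace.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))

# Magnitude change only: same subspace, so all angles are numerically zero.
angles_scaled = principal_angles(A, 10.0 * A)

# Subspace change: an unrelated random subspace gives clearly nonzero angles.
B = rng.standard_normal((50, 3))
angles_changed = principal_angles(A, B)
```

Rescaling the data leaves the angles at zero up to round-off, while replacing the subspace does not, which illustrates the property stated in the abstract that principal angles separate subspace change from magnitude change.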