Neighborhood Correlation Analysis for Semi-paired Two-View Data

  • Authors:
  • Xudong Zhou;Xiaohong Chen;Songcan Chen

  • Affiliations:
  • College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing, China 210016 and Information Engineering College, Yangzhou University, Yangzhou, China 225127;College of Science, Nanjing University of Aeronautics & Astronautics, Nanjing, China 210016;College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing, China 210016 and National Key Laboratory for Novel Software Technology, Nanjing University, N ...

  • Venue:
  • Neural Processing Letters
  • Year:
  • 2013

Abstract

Canonical correlation analysis (CCA) is a widely used technique for analyzing two datasets (two views of the same objects). However, CCA requires the samples of the two views to be fully paired. In practice, we often face the semi-paired scenario, in which only a limited number of paired samples are available while unpaired samples are plentiful. In such a scenario, CCA is prone to overfitting and thus performs poorly, since by definition it can use only the paired samples. To overcome this shortcoming, several semi-paired variants of CCA have been proposed. However, these methods use unpaired samples only in a single-view learning manner, capturing the structure of each individual view in order to regularize CCA. Intuitively, using unpaired samples in a two-view learning manner should be more natural and more attractive, since CCA itself is a two-view learning method. Accordingly, a novel semi-paired variant of CCA, named Neighborhood Correlation Analysis (NeCA), is developed; it uses unpaired samples in a two-view learning manner by incorporating between-view neighborhood relationships into CCA. These relationships are obtained by combining the within-view neighborhood relationships of all data in each view (both paired and unpaired) with the between-view pairing information. NeCA can thus take fuller advantage of the unpaired samples and effectively mitigate the overfitting caused by the limited paired data. Promising experimental results on several popular multi-view datasets show its feasibility and effectiveness.
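
The abstract does not state NeCA's exact formulation, so the following is only a minimal NumPy sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that the first `n_paired` rows of `X` and `Y` are the paired samples, that within-view neighborhoods are plain k-nearest-neighbor graphs, and that a sample `x_i` and a sample `y_j` count as between-view neighbors when some paired sample lies in the neighborhood of both. The function names (`knn_graph`, `between_view_neighbors`, `neighborhood_cca`) and the parameters `k` and `reg` are hypothetical.

```python
import numpy as np

def knn_graph(X, k):
    """Symmetric 0/1 k-nearest-neighbor adjacency; each point is its own neighbor."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :k + 1]          # the point itself plus k nearest
    A = np.zeros_like(d2)
    A[np.arange(len(X))[:, None], idx] = 1.0
    return np.maximum(A, A.T)

def between_view_neighbors(X, Y, n_paired, k=3):
    """Assumed rule: x_i ~ y_j if a paired sample p has x_p near x_i and y_p near y_j."""
    Ax = knn_graph(X, k)[:, :n_paired]               # (n_x, n_paired)
    Ay = knn_graph(Y, k)[:, :n_paired]               # (n_y, n_paired)
    return (Ax @ Ay.T > 0).astype(float)             # (n_x, n_y)

def neighborhood_cca(X, Y, S, dim=1, reg=1e-3):
    """CCA-style projections whose cross-covariance sums S_ij * x_i y_j^T."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Cxy = Xc.T @ S @ Yc / S.sum()
    # Assumption: within-view covariances use all samples, with ridge regularization.
    Cxx = Xc.T @ Xc / len(X) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / len(Y) + reg * np.eye(Y.shape[1])
    # Whiten both views via Cholesky factors, then take the leading singular vectors.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M)
    Wx = np.linalg.solve(Lx.T, U[:, :dim])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :dim])
    return Wx, Wy, s[:dim]
```

In this sketch, restricting `S` to ones on the paired diagonal uses only the paired samples in the cross-view term, as CCA does; the neighborhood-based `S` additionally lets unpaired samples contribute to that term, which is the two-view use of unpaired data that the abstract describes.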