Semi-supervised learning with multiple views

  • Authors: Peter L. Bartlett; David Stuart Rosenberg
  • Affiliations: University of California, Berkeley; University of California, Berkeley
  • Venue: Semi-supervised learning with multiple views
  • Year: 2008

Abstract

In semi-supervised learning, we receive training data comprising labeled points and unlabeled points, and we search for a target function that maps new unlabeled points to their appropriate labels. In the co-regularized least squares (CoRLS) algorithm for semi-supervised learning, we assume that the target function is well-approximated by some function in each of two function classes. We search for a function in each class that performs well on the labeled training data, and we restrict our search space to those pairs of functions, one from each class, whose predictions agree on the unlabeled data. By restricting the search space, we potentially get improved generalization performance for the chosen functions. Our first main contribution is a precise characterization of the size, in terms of Rademacher complexity, of the agreement-constrained search space. We find that the agreement constraint reduces the Rademacher complexity by an amount that depends on the "distance" between the function classes, as measured by a data-dependent metric. Experimentally, we find that the amount of reduction in complexity introduced by the agreement constraint correlates with the amount of improvement that the constraint gives in the CoRLS algorithm.

We next present a new framework for multi-view learning called multi-view point cloud regularization (MVPCR), which has both CoRLS and the popular "manifold regularization" algorithms as special cases. MVPCR is initially formulated as an optimization problem over multiple reproducing kernel Hilbert spaces (RKHSs), where the objective function involves both the labeled and unlabeled data. Our second main result is the construction of a single RKHS, with a data-dependent norm, that reduces the original optimization problem to a supervised learning problem in a single RKHS. Using the multi-view kernel corresponding to this new RKHS, we can easily convert any standard supervised kernel method into a semi-supervised, multi-view method.
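To make the CoRLS setup concrete: with a soft (squared-penalty) version of the agreement constraint and squared loss, the representer theorem reduces the search over two RKHSs to a finite linear system in the kernel expansion coefficients. The sketch below is illustrative, not the thesis's exact formulation — the averaged predictor, the specific penalty weights (`g1`, `g2`, `mu`), and all function names are assumptions of this sketch.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian RBF kernel between row-sets A (m, d) and B (n, d)
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def corls_fit(X_lab, y, X_unl, k1, k2, g1=1e-3, g2=1e-3, mu=0.1):
    """Co-regularized least squares, soft-agreement sketch.

    Minimizes: squared loss of the averaged predictor (f1 + f2)/2 on the
    labeled points, plus RKHS-norm penalties g1*||f1||^2 + g2*||f2||^2,
    plus mu * squared disagreement of f1 and f2 on the unlabeled points.
    By the representer theorem each f_v expands over all labeled and
    unlabeled points, so the minimizer solves a block-linear system.
    """
    X = np.vstack([X_lab, X_unl])
    l, n = len(X_lab), len(X)
    K1, K2 = k1(X, X), k2(X, X)
    J = np.eye(n)[:l]            # selects labeled rows
    U = np.eye(n)[l:]            # selects unlabeled rows
    F1, F2 = 0.5 * J @ K1, 0.5 * J @ K2   # averaged predictor on labels
    D1, D2 = U @ K1, U @ K2               # predictions on unlabeled points
    # Stationarity conditions of the quadratic objective, as a block system
    A = np.block([
        [F1.T @ F1 + g1 * K1 + mu * D1.T @ D1, F1.T @ F2 - mu * D1.T @ D2],
        [F2.T @ F1 - mu * D2.T @ D1, F2.T @ F2 + g2 * K2 + mu * D2.T @ D2],
    ])
    b = np.concatenate([F1.T @ y, F2.T @ y])
    alpha = np.linalg.lstsq(A, b, rcond=None)[0]
    a1, a2 = alpha[:n], alpha[n:]

    def predict(X_new):
        return 0.5 * (k1(X_new, X) @ a1 + k2(X_new, X) @ a2)

    return predict, (a1, a2)
```

Note that the disagreement penalty couples the two coefficient vectors only through the off-diagonal blocks; setting `mu=0` decouples the system into two independent regularized least-squares problems, one per view.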
The multi-view kernel also allows us to refine our CoRLS generalization bound using localized Rademacher complexity theory. As a practical application of our new framework, we present the manifold co-regularization algorithm, which leads to empirical improvements over manifold regularization on several semi-supervised tasks.
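One published special case of this "single RKHS with a data-dependent norm" idea is the deformed kernel of manifold regularization (Sindhwani, Niyogi, and Belkin, 2005), which the abstract identifies as a special case of MVPCR: given a base kernel K with Gram matrix K on the point cloud and a point-cloud penalty matrix M (e.g. a scaled graph Laplacian), the new kernel is K~(x, z) = K(x, z) - k_x^T (I + M K)^{-1} M k_z. The sketch below implements that single-view special case, not the thesis's multi-view kernel; all names and parameters are illustrative.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    # Gaussian RBF kernel between row-sets A (m, d) and B (n, d)
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def deformed_kernel(X_cloud, base_k, M):
    """Data-dependent kernel K~(x, z) = K(x, z) - k_x^T (I + M K)^{-1} M k_z.

    X_cloud : labeled + unlabeled points defining the data-dependent norm
    base_k  : base kernel function
    M       : PSD penalty matrix on the cloud (e.g. mu * graph Laplacian)
    """
    K = base_k(X_cloud, X_cloud)
    n = len(X_cloud)
    G = np.linalg.solve(np.eye(n) + M @ K, M)   # (I + MK)^{-1} M

    def ktilde(A, B):
        ka = base_k(A, X_cloud)   # rows are k_x^T for x in A
        kb = base_k(B, X_cloud)
        return base_k(A, B) - ka @ G @ kb.T

    return ktilde

# Usage sketch: plug the deformed kernel into any supervised kernel method,
# here kernel ridge regression on the labeled points alone.
def kernel_ridge(ktilde, X_lab, y, lam=1e-2):
    Kt = ktilde(X_lab, X_lab)
    alpha = np.linalg.solve(Kt + lam * np.eye(len(X_lab)), y)
    return lambda X_new: ktilde(X_new, X_lab) @ alpha
```

This is the sense in which a data-dependent kernel converts a supervised method into a semi-supervised one: the unlabeled points enter only through the deformed kernel, so the downstream learner is unchanged.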