Multi-view clustering via canonical correlation analysis

Authors:
Kamalika Chaudhuri; Sham M. Kakade; Karen Livescu; Karthik Sridharan
Affiliations:
ITA, UC San Diego, La Jolla, CA; Toyota Technological Institute at Chicago, Chicago, IL; Toyota Technological Institute at Chicago, Chicago, IL; Toyota Technological Institute at Chicago, Chicago, IL
Venue:
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Year:
2009


Abstract

Clustering data in high dimensions is believed to be a hard problem in general. A number of efficient clustering algorithms developed in recent years address this problem by projecting the data into a lower-dimensional subspace, e.g., via Principal Components Analysis (PCA) or random projections, before clustering. Here, we consider constructing such projections using multiple views of the data, via Canonical Correlation Analysis (CCA). Under the assumption that the views are uncorrelated given the cluster label, we show that the separation conditions required for the algorithm to succeed are significantly weaker than those in prior results in the literature. We provide results for mixtures of Gaussians and mixtures of log-concave distributions. We also provide empirical support from audio-visual speaker clustering (where we want the clusters to correspond to speaker ID) and from hierarchical Wikipedia document clustering (where one view is the words in the document and the other is the link structure).
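
The idea of projecting with CCA before clustering can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes synthetic two-view Gaussian data whose views are independent given the cluster label, projects the first view onto its top k - 1 canonical directions, and then swaps in plain k-means as the clustering step. All function names (inv_sqrt, cca_project, kmeans, best_accuracy) and all parameter values (k, d, n, the mean scale, the regularizer) are hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def inv_sqrt(C):
    # Inverse square root of a symmetric positive definite matrix.
    w, V = np.linalg.eigh(C)
    return (V / np.sqrt(w)) @ V.T

def cca_project(X1, X2, dim, reg=1e-6):
    # Project view 1 onto its top `dim` canonical directions w.r.t. view 2.
    X1 = X1 - X1.mean(axis=0)
    X2 = X2 - X2.mean(axis=0)
    n = len(X1)
    W1 = inv_sqrt(X1.T @ X1 / n + reg * np.eye(X1.shape[1]))
    W2 = inv_sqrt(X2.T @ X2 / n + reg * np.eye(X2.shape[1]))
    # SVD of the whitened cross-covariance yields the canonical directions.
    U, _, _ = np.linalg.svd(W1 @ (X1.T @ X2 / n) @ W2)
    return X1 @ W1 @ U[:, :dim]

def kmeans(X, k, rng, restarts=5, iters=50):
    # Lloyd's algorithm with farthest-first seeding; keeps the lowest-cost run.
    best_labels, best_cost = None, np.inf
    for _ in range(restarts):
        centers = [X[rng.integers(len(X))]]
        while len(centers) < k:
            d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
            centers.append(X[np.argmax(d2)])
        centers = np.array(centers)
        for _ in range(iters):
            dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):  # leave an empty cluster's center as is
                    centers[j] = X[labels == j].mean(axis=0)
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(axis=2)
        cost = dist.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = dist.argmin(axis=1), cost
    return best_labels

def best_accuracy(true, pred, k):
    # Clustering accuracy under the best relabeling of the predicted clusters.
    return max(np.mean(np.array(p)[pred] == true)
               for p in itertools.permutations(range(k)))

# Synthetic data (assumed setup): noise is independent across views given the label.
rng = np.random.default_rng(0)
k, d, n = 3, 30, 5000
labels = rng.integers(k, size=n)
mu1 = rng.normal(size=(k, d)) * 4 / np.sqrt(d)
mu2 = rng.normal(size=(k, d)) * 4 / np.sqrt(d)
X1 = mu1[labels] + rng.normal(size=(n, d))
X2 = mu2[labels] + rng.normal(size=(n, d))

Z = cca_project(X1, X2, k - 1)   # (n, k - 1) low-dimensional representation
pred = kmeans(Z, k, rng)
acc = best_accuracy(labels, pred, k)
print(f"clustering accuracy after CCA projection: {acc:.3f}")
```

Because the noise in the two views is independent given the label, the cross-covariance between views is driven only by the cluster means, so its leading singular directions (after whitening) point along the subspace that separates the clusters. That is the intuition this sketch is meant to convey; the separation conditions and guarantees stated in the abstract are properties of the paper's algorithm, not of this toy setup.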