A Matrix Factorization Approach for Integrating Multiple Data Views

Authors:
Derek Greene;Pádraig Cunningham
Affiliations:
School of Computer Science & Informatics, University College Dublin;School of Computer Science & Informatics, University College Dublin
Venue:
ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Year:
2009

Abstract

In many domains, several different representations or "views" describe the same set of objects. Taken alone, each view is often deficient or incomplete. A key problem for exploratory data analysis is therefore the integration of multiple views to discover the underlying structures in a domain, and this problem becomes more difficult when the views disagree. We introduce a new unsupervised algorithm for combining information from related views, using a late integration strategy. Combination is performed by applying an approach based on matrix factorization to group related clusters produced on individual views. This yields a projection of the original clusters in the form of a new set of "meta-clusters" covering the entire domain. We also provide a novel model selection strategy for identifying the correct number of meta-clusters. Evaluations performed on a number of multi-view text clustering problems demonstrate the effectiveness of the algorithm.
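
To make the late integration idea in the abstract more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' algorithm. It assumes each view has already been clustered separately, stacks the resulting clusters as binary membership columns, and groups them with a plain non-negative matrix factorization (Lee-Seung multiplicative updates, used here as a stand-in for the factorization in the paper). The function names, the `-1` convention for objects missing from a view, and the restart scheme are all hypothetical, and the paper's model selection strategy is not reproduced.

```python
import numpy as np


def membership_matrix(view_labels):
    """Stack the clusters of every view as binary columns (objects x clusters).

    Assumption: a label of -1 marks an object that is absent from that view.
    """
    columns = []
    for labels in view_labels:
        labels = np.asarray(labels)
        for c in np.unique(labels[labels >= 0]):
            columns.append((labels == c).astype(float))
    return np.column_stack(columns)


def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF, X ~ P @ H, via Lee-Seung multiplicative updates."""
    rng = np.random.default_rng(seed)
    P = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (P.T @ X) / (P.T @ P @ H + eps)
        P *= (X @ H.T) / (P @ H @ H.T + eps)
    return P, H


def meta_clusters(view_labels, k, n_restarts=5):
    """Group per-view clusters into k meta-clusters (hypothetical helper).

    Returns one meta-cluster index per object, plus the factors P
    (objects x k) and H (k x base clusters).
    """
    X = membership_matrix(view_labels)
    best = None
    for seed in range(n_restarts):
        P, H = nmf(X, k, seed=seed)
        err = np.linalg.norm(X - P @ H)
        if best is None or err < best[0]:
            best = (err, P, H)
    _, P, H = best
    # Rescale so each row of H sums to 1; this keeps P @ H unchanged
    # and makes the columns of P comparable before taking the argmax.
    scale = H.sum(axis=1) + 1e-12
    P = P * scale
    H = H / scale[:, None]
    return P.argmax(axis=1), P, H


# Three toy views of six objects; object 2 is missing from the third view.
views = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, -1, 1, 1, 1],
]
labels, P, H = meta_clusters(views, k=2)
print(labels)
```

In this toy setup the views agree on the grouping but use different label names, and one view is incomplete. The rows of `H` indicate which base clusters contribute to each meta-cluster, while `P` gives every object a weight in each meta-cluster, including the object that one view does not cover. The numbering of the meta-clusters is arbitrary.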