A collective NMF method for detecting protein functional module from multiple data sources

Authors:
Yuan Zhang;Nan Du;Liang Ge;Kebin Jia;Aidong Zhang
Affiliations:
Beijing University of Technology, Beijing, China;State University of New York at Buffalo, Buffalo;State University of New York at Buffalo, Buffalo;Beijing University of Technology, Beijing, China;State University of New York at Buffalo, Buffalo
Venue:
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Year:
2012

Abstract

Detecting functional modules in protein-protein interaction (PPI) networks is an active research area with many practical applications. Two concerns persist, however: high-throughput experiments yield false interactions, and a single PPI network carries too little information to give satisfactory results. To address these problems, we propose a soft clustering method based on Collective Non-negative Matrix Factorization (CoNMF) that efficiently integrates gene ontology (GO) information, gene expression data, and PPI networks. In our method, the three data sources are combined into two graphs represented by similarity adjacency matrices, and these graphs are approximated by matrix factorizations that share a common factor, which gives a straightforward interpretation of the clustering results. Extensive experiments show that integrating multiple biological data sources improves module detection performance, and that CoNMF outperforms other multi-source data fusion methods by identifying a larger number of more precise protein modules that are biologically meaningful and overlap to a certain degree.
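
The idea of approximating several similarity graphs with one shared factor can be illustrated with a minimal sketch. The abstract does not state the exact CoNMF objective or update rules, so everything below is an assumption made for illustration: each graph A_i is approximated as H H^T with a single non-negative factor H, the weighted squared errors are summed, and H is updated with a damped multiplicative rule of the kind commonly used for symmetric NMF. The function names (`collective_sym_nmf`, `objective`, `soft_memberships`) and all parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def _weights(graphs, weights):
    # Uniform weights unless the caller supplies one weight per graph.
    if weights is None:
        return np.full(len(graphs), 1.0 / len(graphs))
    return np.asarray(weights, dtype=float)


def objective(graphs, h, weights=None):
    """Weighted sum of squared errors sum_i w_i * ||A_i - H H^T||_F^2."""
    w = _weights(graphs, weights)
    recon = h @ h.T
    return float(sum(wi * np.sum((np.asarray(a, dtype=float) - recon) ** 2)
                     for wi, a in zip(w, graphs)))


def collective_sym_nmf(graphs, k, weights=None, n_iter=300,
                       beta=0.5, eps=1e-9, seed=0):
    """Fit one shared non-negative factor H (n x k) to several symmetric graphs.

    Assumed model (not the paper's stated one): A_i ~ H H^T for every graph.
    """
    graphs = [np.asarray(a, dtype=float) for a in graphs]
    w = _weights(graphs, weights)
    n = graphs[0].shape[0]

    # With a shared H, the gradient only depends on the weighted sum of graphs.
    a_bar = sum(wi * a for wi, a in zip(w, graphs))
    w_sum = w.sum()

    rng = np.random.default_rng(seed)
    h = rng.random((n, k)) + eps  # strictly positive start

    for _ in range(n_iter):
        numer = a_bar @ h
        denom = w_sum * (h @ (h.T @ h)) + eps
        # Damped multiplicative update; the factor is >= 1 - beta, so H stays >= 0.
        h *= (1.0 - beta) + beta * numer / denom
    return h


def soft_memberships(h, eps=1e-12):
    """Row-normalize H so each protein gets a soft assignment over modules."""
    return h / np.maximum(h.sum(axis=1, keepdims=True), eps)


if __name__ == "__main__":
    # Toy data: two 5-protein modules; the second graph has one spurious edge.
    block = np.kron(np.eye(2), np.ones((5, 5)))
    g1 = block.copy()
    g2 = block.copy()
    g2[0, 9] = g2[9, 0] = 1.0

    H = collective_sym_nmf([g1, g2], k=2)
    print(np.round(soft_memberships(H), 2))
```

Because each row of the normalized factor is a distribution over modules, a protein can belong to more than one module with non-trivial weight, which is one simple way to read the "soft clustering" and "overlapping" properties mentioned in the abstract. A formulation with per-graph scaling matrices or different similarity measures would need different update rules.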