From pull-down data to protein interaction networks and complexes with biological relevance

Authors:
Bing Zhang;Byung-Hoon Park;Tatiana Karpinets;Nagiza F. Samatova
Venue:
Bioinformatics
Year:
2008

Abstract

Motivation: Recent improvements in high-throughput mass spectrometry (MS) technology have expedited genome-wide discovery of protein–protein interactions by making it possible to detect protein complexes in a physiological setting. Computational inference of protein interaction networks and protein complexes from MS data is challenging. Advances are required in developing robust and seamlessly integrated procedures for assessing protein–protein interaction affinities, representing protein interaction networks mathematically, discovering protein complexes and evaluating their biological relevance.

Results: A multi-step but easy-to-follow framework for identifying protein complexes from MS pull-down data is introduced. It assesses the interaction affinity between two proteins based on the similarity of their co-purification patterns derived from MS data. It constructs a protein interaction network by adopting a knowledge-guided threshold selection method. Based on the network, it identifies protein complexes and infers their core components using a graph-theoretical approach. It deploys a statistical evaluation procedure to assess the biological relevance of each complex found. On Saccharomyces cerevisiae pull-down data, the framework outperformed other, more complicated schemes by at least 10% in F1-measure and identified 610 protein complexes with high functional homogeneity based on enrichment in Gene Ontology (GO) annotation. Manual examination of the complexes suggested hypotheses about the causes of false identifications. Namely, co-purification of different protein complexes mediated by a common non-protein molecule, such as DNA, might be a source of false positives, and protein identification bias in pull-down technology, such as hydrophilic bias, could result in false negatives.

Contact: samatovan@ornl.gov

Supplementary information: Supplementary data are available at Bioinformatics online.
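The steps named in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the clique-based complex detection or the enrichment test, so the Jaccard index, the basic Bron-Kerbosch enumeration and the hypergeometric tail used here are assumptions chosen as common stand-ins. The protein and bait names, the fixed threshold of 0.5 and all function names are hypothetical.

```python
from itertools import combinations
from math import comb


def jaccard(a, b):
    """Similarity of two co-purification patterns (sets of bait IDs).
    Assumption: Jaccard index; the abstract does not name the measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_network(patterns, threshold):
    """Link two proteins when their pattern similarity reaches the threshold."""
    graph = {protein: set() for protein in patterns}
    for p, q in combinations(sorted(patterns), 2):
        if jaccard(patterns[p], patterns[q]) >= threshold:
            graph[p].add(q)
            graph[q].add(p)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).
    Assumption: cliques stand in for the 'graph-theoretical approach'."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


def enrichment_pvalue(n_total, n_annotated, n_complex, n_hits):
    """Hypergeometric upper tail: probability of seeing at least n_hits
    annotated proteins in a complex of size n_complex drawn at random."""
    upper = min(n_annotated, n_complex)
    favourable = sum(
        comb(n_annotated, k) * comb(n_total - n_annotated, n_complex - k)
        for k in range(n_hits, upper + 1)
    )
    return favourable / comb(n_total, n_complex)


# Hypothetical toy data: protein -> baits it co-purified with.
patterns = {
    "A": {"b1", "b2", "b3"},
    "B": {"b1", "b2", "b3"},
    "C": {"b1", "b2"},
    "D": {"b4"},
    "E": {"b4", "b5"},
}
network = build_network(patterns, threshold=0.5)
complexes = maximal_cliques(network)
# Suppose A, B and C share one GO term among the 5 proteins.
p_value = enrichment_pvalue(n_total=5, n_annotated=3, n_complex=3, n_hits=3)
```

In this toy example the threshold is fixed by hand; the framework described above instead selects it with a knowledge-guided method, which the sketch does not attempt to reproduce.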