High-Throughput 3D Structural Homology Detection via NMR Resonance Assignment

Authors:
Christopher James Langmead;Bruce Randall Donald
Affiliations:
Carnegie Mellon University;Dartmouth University
Venue:
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Year:
2004

Abstract

One goal of the structural genomics initiative is the identification of new protein folds. Sequence-based structural homology prediction methods are an important means of prioritizing unknown proteins for structure determination. However, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure, called HD, for detecting 3D structural homologies from sparse, unassigned protein NMR data. Our method identifies 3D models in a protein structural database whose geometries best fit the unassigned experimental NMR data. HD does not use, and is thus not limited by, sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or homology modelling. The algorithm runs in O(pn + pn^(5/2) log(cn) + p log p) time, where p is the number of proteins in the database, n is the number of residues in the target protein, and c is the maximum edge weight in an integer-weighted bipartite graph. Our experiments on real NMR data from 3 different proteins against a database of 4,500 representative folds demonstrate that the method identifies closely related protein folds, including sub-domains of larger proteins, with as little as 10-30% sequence homology between the target protein (or sub-domain) and the computed model. In particular, we report no false negatives or false positives despite significant percentages of missing experimental data.
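
As a rough illustration of the database search described in the abstract, the Python sketch below scores each candidate model with a maximum-weight bipartite matching between experimental peaks and model residues, then sorts the models by score. It is not the authors' HD implementation. The weight matrices, the function names `max_weight_matching` and `rank_models`, and the choice of the O(n^3) Hungarian method (instead of whichever matching algorithm yields the n^(5/2) log(cn) term) are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]


def max_weight_matching(weights: Matrix) -> Tuple[List[int], int]:
    """Return (assignment, total) for a maximum-weight matching of all rows.

    weights[i][j] is a hypothetical integer score for pairing peak i with
    residue j. assignment[i] is the column matched to row i.
    """
    n, m = len(weights), len(weights[0])
    if n > m:
        raise ValueError("need at least as many columns as rows")
    inf = float("inf")
    # Hungarian method on negated weights: minimizing -w maximizes w.
    u = [0] * (n + 1)          # row potentials (1-based)
    v = [0] * (m + 1)          # column potentials (1-based)
    match = [0] * (m + 1)      # match[j] = row matched to column j, 0 if free
    way = [0] * (m + 1)        # back-pointers along the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = -weights[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the matching along the augmenting path back to column 0.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if match[j]:
            assignment[match[j] - 1] = j - 1
    total = sum(weights[i][assignment[i]] for i in range(n))
    return assignment, total


def rank_models(database: Dict[str, Matrix]) -> List[Tuple[str, int]]:
    """Score every model by its best matching and sort best-first."""
    scored = [(name, max_weight_matching(w)[1]) for name, w in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy database: three hypothetical models, each with a 3x3 weight matrix.
    toy_db = {
        "model_a": [[4, 1, 3], [2, 0, 5], [3, 2, 2]],
        "model_b": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
        "model_c": [[5, 5, 5], [5, 5, 5], [5, 5, 6]],
    }
    for name, score in rank_models(toy_db):
        print(name, score)
```

The sketch runs one matching per database model and then a single sort over the p scores. That final sort is consistent in shape with the p log p term of the stated bound, though the abstract does not say which step of HD each term comes from.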