Randomized algorithms
RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
An Efficient Branch-and-Bound Algorithm for the Assignment of Protein Backbone NMR Peaks
CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
On a simple randomized algorithm for finding a 2-factor in sparse graphs
Information Processing Letters
Generating Graphs for Visual Analytics through Interactive Sketching
IEEE Transactions on Visualization and Computer Graphics
CISA: Combined NMR Resonance Connectivity Information Determination and Sequential Assignment
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
An approximation algorithm for a bottleneck traveling salesman problem
Journal of Discrete Algorithms
An assignment walk through 3D NMR spectrum
CIBCB'09 Proceedings of the 6th Annual IEEE conference on Computational Intelligence in Bioinformatics and Computational Biology
An approximation algorithm for a bottleneck traveling salesman problem
CIAC'06 Proceedings of the 6th Italian conference on Algorithms and Complexity
RIBRA–an error-tolerant algorithm for the NMR backbone assignment problem
RECOMB'05 Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology
Nuclear magnetic resonance (NMR) spectroscopy allows scientists to study protein structure, dynamics, and interactions in solution. A necessary first step for such applications is determining the resonance assignment, mapping spectral data to atoms and residues in the primary sequence. Automated resonance assignment algorithms rely on information regarding connectivity (e.g., through-bond atomic interactions) and amino acid type, typically using the former to determine strings of connected residues and the latter to map those strings to positions in the primary sequence. Significant ambiguity exists in both connectivity and amino acid type, and different algorithms have combined the information in two phases (find short unambiguous strings then align) or simultaneously (align while extending strings). This paper focuses on the information content available in connectivity alone, allowing for ambiguity rather than handling only unambiguous strings, and complements existing work on the information content in amino acid type.

In this paper, we develop a novel random-graph theoretic framework for algorithmic analysis of NMR sequential assignment. Our random graph model captures the structure of chemical shift degeneracy (a key source of connectivity ambiguity). We then give a simple and natural randomized algorithm for finding an optimum sequential cover. The algorithm naturally and efficiently reuses substrings while exploring connectivity choices; it overcomes local ambiguity by enforcing global consistency of all choices. We employ our random graph model to analyze our algorithm, and show that it can provably tolerate a relatively large ambiguity while still giving expected optimal performance in polynomial time.

To study the algorithm's performance in practice, we tested it on experimental data sets from a variety of proteins and experimental set-ups. The algorithm was able to overcome significant noise and local ambiguity and consistently identify significant sequential fragments.
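
As a rough illustration of the setting described above, the following Python sketch builds a toy connectivity graph with degeneracy-style ambiguity and searches it for a path that covers every residue. It is not the algorithm or the random graph model from the paper, whose details the abstract does not give. The window-based ambiguity model, the plain randomized backtracking search, and the names `degenerate_graph`, `random_cover`, and `window` are all assumptions made for illustration.

```python
import random


def degenerate_graph(n, window, rng):
    """Toy connectivity graph on residues 0..n-1 (assumed model).

    Every residue i < n-1 has its true edge i -> i+1. Each residue also
    gets a random rank on a hypothetical chemical-shift axis; i gains
    spurious edges to residues whose rank lies within `window` of the
    rank of its true successor.
    """
    rank = list(range(n))
    rng.shuffle(rank)                      # rank[v] = position of v on the shift axis
    by_rank = {r: v for v, r in enumerate(rank)}
    succ = {v: set() for v in range(n)}
    for i in range(n - 1):
        r = rank[i + 1]
        for q in range(max(0, r - window), min(n, r + window + 1)):
            j = by_rank[q]
            if j != i:                     # no self-loops
                succ[i].add(j)
    return succ


def random_cover(succ, rng):
    """Randomized backtracking search for a path visiting every vertex.

    Successors are tried in random order; a choice that leads to a dead
    end is undone, so only globally consistent choices survive.
    Exhaustive in the worst case, so suitable for small graphs only.
    """
    n = len(succ)

    def extend(path, used):
        if len(path) == n:
            return path
        choices = [v for v in succ[path[-1]] if v not in used]
        rng.shuffle(choices)
        for v in choices:
            used.add(v)
            path.append(v)
            if extend(path, used) is not None:
                return path
            path.pop()
            used.remove(v)
        return None

    starts = list(succ)
    rng.shuffle(starts)
    for s in starts:
        found = extend([s], {s})
        if found is not None:
            return found
    return None


rng = random.Random(0)
graph = degenerate_graph(12, 1, rng)
cover = random_cover(graph, rng)
print(cover)
```

Because the true edges i -> i+1 are always present in this toy model, the search is guaranteed to find some covering path, although the path it returns may differ from the true sequence when spurious edges allow an alternative. Increasing `window` raises the ambiguity and, for this naive search, the amount of backtracking.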