Approximation algorithms for NMR spectral peak assignment

Authors:
Zhi-Zhong Chen;Tao Jiang;Guohui Lin;Jianjun Wen;Dong Xu;Jinbo Xu;Ying Xu
Affiliations:
Department of Mathematical Sciences, Tokyo Denki University, Hatoyama, Saitama 350-0394, Japan;Department of Computer Science, University of California, Riverside, CA;Department of Computing Science, University of Alberta, Edmonton, Alberta T6G 2E8, Canada;Department of Computer Science, University of California, Riverside, CA;Protein Informatics Group, Oak Ridge National Laboratory, Oak Ridge, TN;Department of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada;Protein Informatics Group, Oak Ridge National Laboratory, Oak Ridge, TN
Venue:
Theoretical Computer Science
Year:
2003

Cites (6):
Introduction to algorithms
Maximum bounded 3-dimensional matching is MAX SNP-complete. Information Processing Letters
Approximating discrete collections via local improvements. Proceedings of the sixth annual ACM-SIAM symposium on Discrete algorithms
Greedy local improvement and weighted set packing approximation. Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms
A unified approach to approximating resource allocation and scheduling. STOC '00 Proceedings of the thirty-second annual ACM symposium on Theory of computing
Nuclear magnetic resonance: automated assignment of backbone NMR peaks using constrained bipartite matching. Computing in Science and Engineering

Cited by (6):
An Efficient Branch-and-Bound Algorithm for the Assignment of Protein Backbone NMR Peaks. CSB '02 Proceedings of the IEEE Computer Society Conference on Bioinformatics
Automated Protein NMR Resonance Assignments. CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
An Efficient and Accurate Algorithm for Assigning Nuclear Overhauser Effect Restraints Using a Rotamer Library Ensemble and Residual Dipolar Couplings. CSB '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference
An approximation algorithm for a bottleneck traveling salesman problem. Journal of Discrete Algorithms
An approximation algorithm for a bottleneck traveling salesman problem. CIAC'06 Proceedings of the 6th Italian conference on Algorithms and Complexity
RIBRA: an error-tolerant algorithm for the NMR backbone assignment problem. RECOMB'05 Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology

Abstract

We study a constrained bipartite matching problem where the input is a weighted bipartite graph G = (U, V, E), U is a set of vertices following a sequential order, V is another set of vertices partitioned into a collection of disjoint subsets, each following a sequential order, and E is a set of edges between U and V with non-negative weights. The objective is to find a matching in G with the maximum weight that satisfies the given sequential orders on both U and V, i.e. if u_{i+1} follows u_i in U and if v_{j+1} follows v_j in V, then u_i is matched with v_j if and only if u_{i+1} is matched with v_{j+1}. The problem has recently been formulated as a crucial step in an algorithmic approach for interpreting NMR spectral data (IEEE Comput. Sci. Eng. 4 (2002) 50-62). The interpretation of NMR spectral data is known as a key problem in protein structure determination via NMR spectroscopy. Unfortunately, the constrained bipartite matching problem is NP-hard (IEEE Comput. Sci. Eng. 4 (2002) 50-62). We first propose a 2-approximation algorithm for the problem, which follows directly from the recent result of Bar-Noy et al. (Proc. 32nd ACM Symp. on Theory of Computing (STOC'00), 2000, pp. 735-744) on interval scheduling. However, our extensive experimental results on real NMR spectral data illustrate that the algorithm performs poorly in terms of recovering target-matching edges. We then propose another approximation algorithm that tries to take advantage of the "density" of the sequential order information in V. Although we are only able to prove an approximation ratio of 3 log_2 D for this algorithm, where D is the length of a longest string (ordered subset) in V, the experimental results demonstrate that this new algorithm performs much better on real data, i.e. it is able to recover a large fraction of target-matching edges and the weight of its output matching is often in fact close to the maximum. We also prove that the problem is MAX SNP-hard, even if the input bipartite graph is unweighted. We further present an approximation algorithm for a nontrivial special case that breaks the ratio-2 barrier.
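
To make the matching constraint concrete, the following is a minimal sketch of an exhaustive solver for tiny instances. It is not one of the algorithms from the paper and runs in exponential time. It reads the constraint as follows (an assumption drawn from the abstract's wording): each ordered subset of V is either left unmatched or matched as a whole to consecutive vertices of U, and every edge used must exist in E. The names `placement_weight` and `best_matching`, the representation of U as positions 0..n-1, and the example weights are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Edge weights keyed by (position in U, vertex name in V). Assumed representation.
Weights = Dict[Tuple[int, str], float]


def placement_weight(start: int, string: List[str], w: Weights) -> Optional[float]:
    """Weight of matching `string` to U[start], U[start+1], ...; None if an edge is missing."""
    total = 0.0
    for offset, v in enumerate(string):
        edge = (start + offset, v)
        if edge not in w:
            return None
        total += w[edge]
    return total


def best_matching(n: int, strings: List[List[str]], w: Weights) -> Tuple[float, Dict[int, int]]:
    """Try every way of placing or skipping each string; return (weight, {string index: start})."""
    best_weight = 0.0
    best_choice: Dict[int, int] = {}

    def extend(k: int, used: FrozenSet[int], weight: float, chosen: Dict[int, int]) -> None:
        nonlocal best_weight, best_choice
        if k == len(strings):
            if weight > best_weight:
                best_weight, best_choice = weight, dict(chosen)
            return
        extend(k + 1, used, weight, chosen)  # option 1: leave string k unmatched
        length = len(strings[k])
        for start in range(n - length + 1):  # option 2: place it at each feasible start
            span = frozenset(range(start, start + length))
            if span & used:
                continue  # overlaps a vertex of U that is already matched
            gain = placement_weight(start, strings[k], w)
            if gain is None:
                continue  # some required edge is not in E
            chosen[k] = start
            extend(k + 1, used | span, weight + gain, chosen)
            del chosen[k]

    extend(0, frozenset(), 0.0, {})
    return best_weight, best_choice


# Hypothetical instance: |U| = 4, V split into the strings (a, b), (c), (d, e).
strings = [["a", "b"], ["c"], ["d", "e"]]
w = {(0, "a"): 3, (1, "b"): 2, (1, "a"): 4, (2, "b"): 4,
     (0, "c"): 1, (3, "c"): 2,
     (2, "d"): 5, (3, "e"): 5, (0, "d"): 1, (1, "e"): 1}
print(best_matching(4, strings, w))  # (15.0, {0: 0, 2: 2})
```

Under this reading, each feasible placement of a string is a weighted interval on U, and a valid matching is a set of non-overlapping intervals with at most one interval per string, which is the connection to interval scheduling that the abstract mentions.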