Improved Approximation Algorithms for NMR Spectral Peak Assignment

Authors:
Zhi-Zhong Chen; Tao Jiang; Guohui Lin; Jianjun Wen; Dong Xu; Ying Xu
Venue:
WABI '02 Proceedings of the Second International Workshop on Algorithms in Bioinformatics
Year:
2002

References (4)

Introduction to algorithms
Maximum bounded 3-dimensional matching is MAX SNP-complete. Information Processing Letters
A unified approach to approximating resource allocation and scheduling. Journal of the ACM (JACM)
Nuclear magnetic resonance: automated assignment of backbone NMR peaks using constrained bipartite matching. Computing in Science and Engineering

Cited by (2)

Automated Protein NMR Resonance Assignments. CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics
RIBRA–an error-tolerant algorithm for the NMR backbone assignment problem. RECOMB'05 Proceedings of the 9th Annual international conference on Research in Computational Molecular Biology

Abstract

We study a constrained bipartite matching problem where the input is a weighted bipartite graph G = (U, V, E), U is a set of vertices following a sequential order, V is another set of vertices partitioned into a collection of disjoint subsets, each following a sequential order, and E is a set of edges between U and V with non-negative weights. The objective is to find a matching in G with the maximum weight that satisfies the given sequential orders on both U and V, i.e., if u_{i+1} follows u_i in U and if v_{j+1} follows v_j in V, then u_i is matched with v_j if and only if u_{i+1} is matched with v_{j+1}. The problem has recently been formulated as a crucial step in an algorithmic approach for interpreting NMR spectral data [15]. The interpretation of NMR spectral data is known as a key problem in protein structure determination via NMR spectroscopy. Unfortunately, the constrained bipartite matching problem is NP-hard [15]. We first propose a 2-approximation algorithm for the problem, which follows directly from the recent result of Bar-Noy et al. [2] on interval scheduling. However, our extensive experimental results on real NMR spectral data illustrate that the algorithm performs poorly in terms of recovering the target-matching (i.e., correct) edges. We then propose another approximation algorithm that tries to take advantage of the "density" of the sequential order information in V. Although we are only able to prove an approximation ratio of 3 log_2 D for this algorithm, where D is the length of a longest string in V, the experimental results demonstrate that this new algorithm performs much better on real data, i.e., it is able to recover a large fraction of the target-matching edges and the weight of its output matching is often in fact close to the maximum. We also prove that the problem is MAX SNP-hard, even if the input bipartite graph is unweighted. We further present an approximation algorithm for a nontrivial special case that breaks the ratio-2 barrier.
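
To make the sequential-order constraint in the abstract concrete, the following is a minimal Python sketch, not any of the paper's approximation algorithms. It assumes U is indexed 0..n-1 in its sequential order, each ordered subset of V is given as a list of labels, and a matching is a dictionary from U-indices to V-labels. The names `respects_orders`, `matching_weight`, and `brute_force_best`, as well as the toy weights, are hypothetical and chosen only for illustration. The exhaustive search takes exponential time and is meant solely for checking tiny instances by hand.

```python
from itertools import product


def respects_orders(num_u, strings, matching):
    """Check the abstract's condition literally: whenever u_{i+1} follows u_i
    and v_{j+1} follows v_j, u_i is matched with v_j iff u_{i+1} is matched
    with v_{j+1}."""
    for s in strings:
        for j in range(len(s) - 1):
            for i in range(num_u - 1):
                left = matching.get(i) == s[j]
                right = matching.get(i + 1) == s[j + 1]
                if left != right:
                    return False
    return True


def matching_weight(weights, matching):
    """Sum the weights of the matched edges (u, v)."""
    return sum(weights[(u, v)] for u, v in matching.items())


def brute_force_best(num_u, strings, weights):
    """Try every assignment of each u to None or to one of its neighbours,
    and keep the heaviest assignment that is a valid constrained matching."""
    options = [[None] + [v for (u, v) in weights if u == i] for i in range(num_u)]
    best, best_w = {}, 0
    for choice in product(*options):
        used = [v for v in choice if v is not None]
        if len(used) != len(set(used)):
            continue  # some v is used twice, so this is not a matching
        matching = {i: v for i, v in enumerate(choice) if v is not None}
        if not respects_orders(num_u, strings, matching):
            continue
        w = matching_weight(weights, matching)
        if w > best_w:
            best, best_w = matching, w
    return best, best_w


# Toy instance (assumed data): U = {0, 1, 2}, V split into strings "a b" and "c".
strings = [["a", "b"], ["c"]]
weights = {(0, "a"): 1, (1, "b"): 1, (1, "a"): 2, (2, "b"): 2, (0, "c"): 3, (2, "c"): 1}

print(brute_force_best(3, strings, weights))  # -> ({0: 'c', 1: 'a', 2: 'b'}, 7)
```

In this toy instance the string "a b" can be placed on positions (0, 1) or (1, 2) of U, and matching only "a" to position 1 is rejected because "b" would then have to be matched to position 2.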