The NOESY jigsaw: automated protein secondary structure and main-chain assignment from sparse, unassigned NMR data

Authors:
Chris Bailey-Kellogg;Alik Widge;John J. Kelley, III;Marcelo J. Berardi;John H. Bushweller;Bruce Randall Donald
Affiliations:
Dartmouth Computer Science Department, Hanover, NH;Dartmouth Computer Science Department, Hanover, NH;Dartmouth Computer Science Department, Hanover, NH and Dartmouth Chemistry Department, Hanover, NH;Dartmouth Chemistry Department, Hanover, NH;Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA;Dartmouth Computer Science Department, Hanover, NH and 6211 Sudikoff Laboratory, Dartmouth Computer Science Department, Hanover, NH
Venue:
RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
Year:
2000

Abstract

High-throughput, data-directed computational protocols for Structural Genomics (or Proteomics) are required in order to evaluate the protein products of genes for structure and function at rates comparable to current gene-sequencing technology. This paper presents the JIGSAW algorithm, a novel high-throughput, automated approach to protein structure characterization with nuclear magnetic resonance (NMR). JIGSAW applies graph algorithms and probabilistic reasoning techniques, enforcing first-principles consistency rules in order to overcome a 5-10% signal-to-noise ratio. It consists of two main components: (1) graph-based secondary structure pattern identification in unassigned heteronuclear NMR data, and (2) assignment of spectral peaks by probabilistic alignment of identified secondary structure elements against the primary sequence. JIGSAW's deferment of assignment until after secondary structure identification differs greatly from traditional approaches, which begin by correlating peaks among dozens of experiments. By deferring assignment, JIGSAW not only eliminates this bottleneck, it also allows the number of experiments to be reduced from dozens to four, none of which requires 13C-labeled protein. This in turn dramatically reduces the amount and expense of wet lab molecular biology for protein expression and purification, as well as the total spectrometer time to collect data.
Our results for three test proteins demonstrate that we are able to identify and align approximately 80 percent of α-helical and 60 percent of β-sheet structure. JIGSAW is very fast, running in minutes on a Pentium-class Linux workstation. This approach yields quick and reasonably accurate (as opposed to the traditional slow and extremely accurate) structure calculations, utilizing a suite of graph analysis algorithms to compensate for the data sparseness.
JIGSAW could be used for quick structural assays to speed data to the biologist early in the process of investigation, and could in principle be applied in an automation-like fashion to a large fraction of the proteome.
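
The second component described in the abstract, probabilistic alignment of an identified secondary structure element against the primary sequence, can be illustrated with a toy sketch. This is not the JIGSAW implementation: the abstract does not specify the scoring model, so the per-position amino-acid probabilities, the gapless sliding-window search, the probability floor, and the names `align_fragment` and `fragment_probs` are all assumptions made for illustration.

```python
import math


def align_fragment(sequence, fragment_probs, floor=1e-6):
    """Gapless alignment of a fragment against a primary sequence.

    sequence       -- one-letter amino-acid string (the primary sequence)
    fragment_probs -- hypothetical input: one dict per fragment position,
                      mapping amino-acid letter -> estimated probability
    floor          -- assumed minimum probability, so that residue types
                      missing from a dict do not give log(0)

    Returns (best_offset, best_log_likelihood).
    """
    k = len(fragment_probs)
    if k == 0 or k > len(sequence):
        raise ValueError("fragment must be non-empty and fit in the sequence")

    best_offset, best_score = None, -math.inf
    # Try every start position and sum the log-probabilities of the
    # residues that the fragment would cover at that position.
    for offset in range(len(sequence) - k + 1):
        score = 0.0
        for i, probs in enumerate(fragment_probs):
            residue = sequence[offset + i]
            score += math.log(max(probs.get(residue, 0.0), floor))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Made-up example data: a 10-residue sequence and a 3-position fragment.
sequence = "MKTAYIAKQR"
fragment_probs = [
    {"A": 0.7, "G": 0.3},
    {"Y": 0.6, "F": 0.4},
    {"I": 0.5, "L": 0.3, "V": 0.2},
]
offset, score = align_fragment(sequence, fragment_probs)
print(offset, score)
```

With this made-up input the fragment fits best at the zero-based offset 3, covering residues A, Y, I. A realistic system would also have to handle several fragments competing for the same sequence positions, which this sketch ignores.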