Faster Mass Spectrometry-Based Protein Inference: Junction Trees Are More Efficient than Sampling and Marginalization by Enumeration

Authors: Oliver Serang; William Stratford Noble
Affiliations: Harvard Medical School and Children's Hospital Boston; University of Washington, Seattle
Venue: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year: 2012

Abstract

The problem of identifying the proteins in a complex mixture using tandem mass spectrometry can be framed as an inference problem on a graph that connects peptides to proteins. Several existing protein identification methods make use of statistical inference methods for graphical models, including expectation maximization, Markov chain Monte Carlo, and full marginalization coupled with approximation heuristics. We show that, for this problem, the majority of the cost of inference usually comes from a few highly connected subgraphs. Furthermore, we evaluate three different statistical inference methods using a common graphical model, and we demonstrate that junction tree inference substantially improves rates of convergence compared to existing methods. The Python code used for this paper is available at http://noble.gs.washington.edu/proj/fido.
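
The observation that a few highly connected subgraphs dominate the cost of inference can be illustrated with a minimal sketch. It is not the code distributed with the paper. It assumes that each protein is a binary (present/absent) variable, so that marginalization by enumeration over a connected component with n proteins visits 2^n joint states. The function names `connected_components` and `enumeration_costs` and the toy peptide-protein edges are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def connected_components(edges):
    """Return the set of proteins in each connected component of the
    bipartite peptide-protein graph given as (peptide, protein) pairs."""
    adjacency = defaultdict(set)
    for peptide, protein in edges:
        # Tag nodes by kind so a peptide and a protein may share a name.
        pep_node = ("peptide", peptide)
        prot_node = ("protein", protein)
        adjacency[pep_node].add(prot_node)
        adjacency[prot_node].add(pep_node)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited node.
        seen.add(start)
        stack = [start]
        proteins = set()
        while stack:
            kind, name = node = stack.pop()
            if kind == "protein":
                proteins.add(name)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(proteins)
    return components


def enumeration_costs(edges):
    """Joint protein states per component (2**n, an assumption of binary
    protein variables), sorted from most to least expensive."""
    return sorted(
        (2 ** len(proteins) for proteins in connected_components(edges)),
        reverse=True,
    )


# Hypothetical toy data: one four-protein component and two small ones.
edges = [
    ("pepA", "P1"), ("pepA", "P2"), ("pepB", "P2"), ("pepB", "P3"),
    ("pepC", "P3"), ("pepC", "P4"),
    ("pepD", "P5"),
    ("pepE", "P6"), ("pepE", "P7"),
]
costs = enumeration_costs(edges)
print(costs)  # [16, 4, 2]
```

In this toy graph the largest component accounts for 16 of the 22 enumerated states. Because the cost is exponential in component size, the imbalance grows quickly as components become larger, which is where a method that exploits structure within a component, such as junction tree inference, would be expected to help most.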