A fast coarse filtering method for peptide identification by mass spectrometry

Authors:
Smriti R. Ramakrishnan;Rui Mao;Aleksey A. Nakorchevskiy;John T. Prince;Willard S. Willard;Weijia Xu;Edward M. Marcotte;Daniel P. Miranker
Affiliations:
Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, USA;Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, USA;Department of Chemistry and Biochemistry, The University of Texas at Austin, Austin, Texas 78712, USA;Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA;Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, USA;Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, USA;Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA;Department of Computer Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
Venue:
Bioinformatics
Year:
2006

Citing 0
Cited 10

An application of the metric access methods to the mass spectrometry data
CIBCB'09 Proceedings of the 6th Annual IEEE conference on Computational Intelligence in Bioinformatics and Computational Biology

Improving the similarity search of tandem mass spectra using metric access methods
Proceedings of the Third International Conference on SImilarity Search and APplications

An inverted index for mass spectra similarity query and comparison with a metric-space method: case study
Proceedings of the Third International Conference on SImilarity Search and APplications

Metric-space search in bioinformatics
SIGSPATIAL Special

Indexing and searching a mass spectrometry database
Algorithms and Applications

Non-metric similarity search of tandem mass spectra including posttranslational modifications
Journal of Discrete Algorithms

Pivot selection: Dimension reduction for distance-based indexing
Journal of Discrete Algorithms

On optimizing the non-metric similarity search in tandem mass spectra by clustering
ISBRA'12 Proceedings of the 8th international conference on Bioinformatics Research and Applications

SimTandem: similarity search in tandem mass spectra
SISAP'12 Proceedings of the 5th international conference on Similarity Search and Applications

A high performance algorithm for clustering of large-scale protein mass spectrometry data using multi-core architectures
Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

Abstract

Motivation: We reformulate the problem of comparing mass spectra by mapping spectra to a vector space model. Our search method uses a metric space indexing algorithm to produce an initial candidate set, which can then be passed to any fine ranking scheme.

Results: We consider three distance measures integrated into a multi-vantage point index structure. Of these, a semi-metric fuzzy cosine distance using peptide precursor mass constraints performs best. The index acts as a coarse, lossless filter with respect to the SEQUEST and ProFound scoring schemes, reducing the number of distance computations to about 0.5% of the database and the number of candidates returned for fine filtering to about 0.02% of the database. The fuzzy cosine distance term improves specificity over a peptide precursor mass filter alone, reducing the number of returned candidates by an order of magnitude. Run time measurements suggest proportional speedups in overall search times. Using an implementation of ProFound's Bayesian score as an example of a fine filter on a test set of Escherichia coli protein fragmentation spectra, the top results of our sample system are consistent with those of SEQUEST.

Contact: smriti@cs.utexas.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
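The coarse filtering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the fuzzy cosine distance, so the bin width, the one-to-one matching of peaks in neighbouring bins, and all function and parameter names here (`to_vector`, `fuzzy_cosine_distance`, `coarse_filter`, `tol_bins`, `mass_tol`, `radius`) are assumptions made for illustration. The sketch scans the database linearly; the multi-vantage point index, which is what avoids most distance computations in the paper, is not shown.

```python
import math

BIN_WIDTH = 1.0  # assumed m/z bin width; not specified in the abstract


def to_vector(peaks, bin_width=BIN_WIDTH):
    """Map a peak list [(mz, intensity), ...] to a sparse vector {bin: intensity}."""
    vec = {}
    for mz, intensity in peaks:
        b = int(mz // bin_width)
        vec[b] = vec.get(b, 0.0) + intensity
    return vec


def _candidate_bins(b, tol_bins):
    """Yield bin b first, then its neighbours in order of increasing offset."""
    yield b
    for d in range(1, tol_bins + 1):
        yield b - d
        yield b + d


def fuzzy_cosine_distance(u, v, tol_bins=1):
    """Angle between two sparse vectors, tolerating small peak shifts.

    Assumed definition: each bin of u is greedily matched to at most one
    unused bin of v within tol_bins, and matched intensities are multiplied.
    """
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return math.pi / 2  # treat an empty spectrum as maximally distant
    used = set()
    dot = 0.0
    for b in sorted(u):
        for c in _candidate_bins(b, tol_bins):
            if c in v and c not in used:
                used.add(c)
                dot += u[b] * v[c]
                break
    cos = min(1.0, dot / (norm_u * norm_v))  # clamp against rounding error
    return math.acos(cos)


def coarse_filter(query_mass, query_peaks, database, mass_tol, radius):
    """Return ids of entries passing both the precursor mass and distance tests.

    database is a list of (id, precursor_mass, peaks) tuples (hypothetical layout).
    """
    q = to_vector(query_peaks)
    hits = []
    for entry_id, mass, peaks in database:
        if abs(mass - query_mass) > mass_tol:
            continue  # precursor mass constraint
        if fuzzy_cosine_distance(q, to_vector(peaks)) <= radius:
            hits.append(entry_id)
    return hits
```

In this sketch the precursor mass test runs first because it is cheap, and the distance test then narrows the candidates that would be handed to a fine scoring scheme such as those named in the abstract.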