Motivation: We reformulate the problem of comparing mass spectra by mapping spectra to a vector space model. Our search method uses a metric space indexing algorithm to produce an initial candidate set, which can then be passed to any fine ranking scheme.

Results: We consider three distance measures integrated into a multi-vantage point index structure. Of these, a semi-metric fuzzy cosine distance using peptide precursor mass constraints performs best. The index acts as a coarse, lossless filter with respect to the SEQUEST and ProFound scoring schemes, reducing the number of distance computations and the number of candidates returned for fine filtering to about 0.5% and 0.02% of the database, respectively. The fuzzy cosine distance term improves specificity over a peptide precursor mass filter alone, reducing the number of returned candidates by an order of magnitude. Run time measurements suggest proportional speedups in overall search times. Using an implementation of ProFound's Bayesian score as an example of a fine filter on a test set of Escherichia coli protein fragmentation spectra, the top results of our sample system are consistent with those of SEQUEST.

Contact: smriti@cs.utexas.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
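The coarse filtering idea in the Results can be illustrated with a small Python sketch. It is not the authors' implementation: the bin width, the neighbor-bin smearing used to approximate "fuzzy" peak matching, the `1 - cosine` form of the distance, the tolerance values, and all function and field names are assumptions made for illustration. A plain linear scan also stands in for the multi-vantage point index, so the sketch shows only the distance and the precursor mass constraint, not the reduction in distance computations.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Map (m/z, intensity) peaks to a sparse vector keyed by bin index."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz // bin_width)] += intensity
    return dict(vec)


def smear(vec, spread=1, weight=0.5):
    """Spread each peak into neighboring bins (assumed form of fuzziness)."""
    out = defaultdict(float)
    for i, x in vec.items():
        out[i] += x
        for k in range(1, spread + 1):
            out[i - k] += weight * x
            out[i + k] += weight * x
    return dict(out)


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two sparse non-negative vectors."""
    dot = sum(x * v.get(i, 0.0) for i, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return max(0.0, 1.0 - dot / (norm_u * norm_v))


def coarse_filter(query, database, mass_tol=2.0, max_dist=0.5):
    """Keep entries within the precursor mass tolerance and distance radius."""
    q = smear(bin_spectrum(query["peaks"]))
    hits = []
    for entry in database:
        # Precursor mass constraint: skip candidates outside the tolerance.
        if abs(entry["precursor_mass"] - query["precursor_mass"]) > mass_tol:
            continue
        d = cosine_distance(q, smear(bin_spectrum(entry["peaks"])))
        if d <= max_dist:
            hits.append((d, entry["id"]))
    return sorted(hits)


# Hypothetical toy data, not taken from the paper.
query = {"precursor_mass": 1000.0, "peaks": [(100.2, 10.0), (200.7, 5.0)]}
database = [
    {"id": "a", "precursor_mass": 1000.5, "peaks": [(100.9, 10.0), (201.1, 5.0)]},
    {"id": "b", "precursor_mass": 1500.0, "peaks": [(100.2, 10.0), (200.7, 5.0)]},
    {"id": "c", "precursor_mass": 1000.0, "peaks": [(500.0, 10.0)]},
]
candidates = coarse_filter(query, database)
```

In this toy run only entry `a` survives: `b` fails the precursor mass constraint and `c` shares no peaks with the query. The surviving candidates would then go to a fine scoring scheme of the user's choice.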