Improving the similarity search of tandem mass spectra using metric access methods

  • Authors:
  • Jiří Novák;Tomáš Skopal;David Hoksza;Jakub Lokoč

  • Affiliations:
  • Charles University in Prague, Prague, Czech Republic;Charles University in Prague, Prague, Czech Republic;Charles University in Prague, Prague, Czech Republic;Charles University in Prague, Prague, Czech Republic

  • Venue:
  • Proceedings of the Third International Conference on SImilarity Search and APplications
  • Year:
  • 2010

Abstract

In biological applications, tandem mass spectrometry is a widely used method for determining protein and peptide sequences from an "in vitro" sample. The sequences are not determined directly; they must be interpreted from the mass spectra that the mass spectrometer outputs. This work focuses on a similarity-search approach to mass spectra interpretation, in which the parametrized Hausdorff distance (dHP) serves as the similarity measure. To provide efficient similarity search under dHP, metric access methods are employed together with the TriGen algorithm, which controls the metricity of dHP. We show that similarity search using dHP interprets peptide mass spectra more correctly than the cosine similarity commonly mentioned in the mass spectrometry literature. Moreover, the search model using dHP could be extended to support chemical modifications in the query mass spectra, which is typically a problem when the cosine similarity is used. Our approach can be used as a coarse filter by any other database approach to mass spectra interpretation.
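
To make the two similarity notions in the abstract concrete, here is a minimal Python sketch. The abstract does not define dHP, so `hausdorff_like_distance` is only an assumed stand-in: an averaged, root-scaled Hausdorff-style distance over peak positions, with a hypothetical `root` parameter. The cosine similarity is computed over binary, binned peak vectors, and the `bin_width` value is likewise an assumption. Spectra are represented simply as lists of m/z values; intensities are ignored.

```python
import math


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine similarity of two spectra encoded as binary vectors of m/z bins."""
    bins_a = {int(mz / bin_width) for mz in peaks_a}
    bins_b = {int(mz / bin_width) for mz in peaks_b}
    if not bins_a or not bins_b:
        return 0.0
    # For binary vectors the dot product is the number of shared bins,
    # and each vector norm is the square root of its number of bins.
    shared = len(bins_a & bins_b)
    return shared / math.sqrt(len(bins_a) * len(bins_b))


def directed_distance(peaks_a, peaks_b, root=2.0):
    """Mean over peaks in a of the root-scaled gap to the nearest peak in b."""
    total = sum(
        min(abs(a - b) for b in peaks_b) ** (1.0 / root) for a in peaks_a
    )
    return total / len(peaks_a)


def hausdorff_like_distance(peaks_a, peaks_b, root=2.0):
    """Symmetric Hausdorff-style distance (assumed form, not the paper's dHP)."""
    return max(
        directed_distance(peaks_a, peaks_b, root),
        directed_distance(peaks_b, peaks_a, root),
    )


if __name__ == "__main__":
    query = [114.1, 227.2, 340.3, 487.3]
    # Hypothetical candidate whose peaks are all shifted by 0.9 m/z units.
    candidate = [mz + 0.9 for mz in query]
    print("cosine:", cosine_similarity(query, candidate))
    print("hausdorff-like:", hausdorff_like_distance(query, candidate))
```

The sketch illustrates one plausible reason for the difference the abstract reports: a binned cosine similarity only rewards peaks that fall into the same bin, whereas a nearest-peak distance degrades gradually as peaks shift. A distance of this kind need not satisfy the triangle inequality, which is the property that metric access methods rely on and that the TriGen algorithm is used to control.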