Protein sequences identification using NM-tree

  • Authors:
  • Jiří Novák;Tomáš Skopal;David Hoksza;Jakub Lokoč;Jakub Galgonek

  • Affiliations:
  • Charles University in Prague;Charles University in Prague;Charles University in Prague;Charles University in Prague;Charles University in Prague

  • Venue:
  • Proceedings of the Fourth International Conference on SImilarity Search and APplications
  • Year:
  • 2011

Abstract

We have generalized a method for tandem mass spectra interpretation that is based on the parameterized Hausdorff distance dHP. Whereas the original method interprets only peptides (short pieces of proteins), this paper describes the interpretation of whole protein sequences. For this purpose, we employ the recently introduced NM-tree to index the database of hypothetical mass spectra for exact or fast approximate search. The NM-tree combines the M-tree with the TriGen algorithm in a way that allows the retrieval precision to be controlled dynamically at query time. We propose a scheme for protein sequence identification using the NM-tree.
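
As a rough illustration of comparing two spectra as sets of peaks, the sketch below computes the classical symmetric Hausdorff distance between two lists of m/z values. This is an assumption-laden stand-in: the abstract does not define the parameterized distance dHP, so the code does not reproduce it, and the function names and peak values are hypothetical.

```python
from typing import Sequence


def directed_hausdorff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest distance from a peak in `a` to its nearest peak in `b`."""
    return max(min(abs(x - y) for y in b) for x in a)


def hausdorff(a: Sequence[float], b: Sequence[float]) -> float:
    """Classical symmetric Hausdorff distance between two non-empty peak lists.

    Note: this is the standard definition, not the parameterized dHP
    mentioned in the abstract.
    """
    if not a or not b:
        raise ValueError("both spectra must contain at least one peak")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical m/z peak lists (illustrative values, not taken from the paper).
query_spectrum = [100.0, 200.5, 300.0]
candidate_spectrum = [100.2, 200.0, 310.0]
distance = hausdorff(query_spectrum, candidate_spectrum)
```

In a search setting, a distance of this kind would be evaluated between the query spectrum and the hypothetical spectra stored in the index. The brute-force pairwise loop here is only for readability and says nothing about how the NM-tree organizes or prunes those comparisons.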