Indexed Hierarchical Approximate String Matching

  • Authors:
  • Luís M. Russo;Gonzalo Navarro;Arlindo L. Oliveira

  • Affiliations:
  • INESC-ID, Lisboa, Portugal 1000 and CITI, Departamento de Informática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal;Dept. of Computer Science, University of Chile,;INESC-ID, Lisboa, Portugal 1000 and Instituto Superior Técnico, Universidade Técnica de Lisboa, Portugal

  • Venue:
  • SPIRE '08 Proceedings of the 15th International Symposium on String Processing and Information Retrieval
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a new search procedure for approximate string matching over suffix trees. We show that hierarchical verification, which is a well-established technique for on-line searching, can also be used with an indexed approach. For this, we need that the index supports bidirectionality, meaning that the search for a pattern can be updated by adding a letter at the right or at the left. This turns out to be easily supported by most compressed text self-indexes, which represent the index and the text essentially in the same space of the compressed text alone. To complete the symbiotic exchange, our hierarchical verification largely reduces the need to access the text, which is expensive in compressed text self-indexes. The resulting algorithm can, in particular, run over an existing fully compressed suffix tree, which makes it very appealing for applications in computational biology. We compare our algorithm with related approaches, showing that our method offers an interesting space/time tradeoff, and in particular does not need of any parameterization, which is necessary in the most successful competing approaches.