Computing matching statistics and maximal exact matches on compressed full-text indexes

  • Authors:
  • Enno Ohlebusch;Simon Gog;Adrian Kügell

  • Affiliations:
  • Institute of Theoretical Computer Science, University of Ulm, Ulm (all three authors)

  • Venue:
  • SPIRE'10 Proceedings of the 17th international conference on String processing and information retrieval
  • Year:
  • 2010

Abstract

Exact string matching is a problem that computer programmers face on a regular basis, and full-text indexes like the suffix tree or the suffix array provide fast string search over large texts. In the last decade, research on compressed indexes has flourished because the main problem in large-scale applications is the space consumption of the index. Nowadays, the most successful compressed indexes are able to obtain almost optimal space and search time simultaneously. It is known that a myriad of sequence analysis and comparison problems can be solved efficiently with established data structures like the suffix tree or the suffix array, but algorithms on compressed indexes that solve these problems are still lacking. Here, we show that matching statistics and maximal exact matches between two strings S1 and S2 can be computed efficiently by matching S2 backwards against a compressed index of S1.
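
To make the idea of matching S2 backwards against an index of S1 concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes an uncompressed stand-in for the index: a naively sorted suffix array, a Burrows-Wheeler transform with occurrence counts obtained by plain counting, and an LCP array that is scanned linearly to shrink a match when a backward step fails. The function name `matching_statistics`, the helper names, and the `"\0"` sentinel (assumed absent from both strings) are hypothetical choices made for illustration. Here `ms[i]` is taken to be the length of the longest prefix of `s2[i:]` that occurs in `s1`.

```python
from bisect import bisect_left


def matching_statistics(s1, s2):
    """Return ms, where ms[i] is the length of the longest prefix of
    s2[i:] that occurs somewhere in s1 (illustrative sketch only)."""
    t = s1 + "\0"  # sentinel; assumed not to occur in s1 or s2
    n = len(t)
    # Naive suffix array; a real index would use a compressed structure.
    sa = sorted(range(n), key=lambda k: t[k:])
    # BWT: character preceding each suffix (t[-1] is the sentinel).
    bwt = "".join(t[k - 1] for k in sa)
    first = sorted(t)  # first column, used to look up C[c]

    def lcp_len(a, b):
        k = 0
        while a + k < n and b + k < n and t[a + k] == t[b + k]:
            k += 1
        return k

    # lcp[k] = longest common prefix of suffixes sa[k-1] and sa[k];
    # -1 at both ends acts as a boundary marker.
    lcp = [-1] + [lcp_len(sa[k - 1], sa[k]) for k in range(1, n)] + [-1]

    def backward_step(c, lb, rb):
        # Interval of c + w, given the interval [lb..rb] of w.
        base = bisect_left(first, c)  # number of characters smaller than c
        return base + bwt.count(c, 0, lb), base + bwt.count(c, 0, rb + 1) - 1

    ms = [0] * len(s2)
    lb, rb, q = 0, n - 1, 0  # interval of the empty string, match length 0
    i = len(s2) - 1
    while i >= 0:
        new_lb, new_rb = backward_step(s2[i], lb, rb)
        if new_lb <= new_rb:
            # The match can be extended to the left by s2[i].
            lb, rb, q = new_lb, new_rb, q + 1
            ms[i] = q
            i -= 1
        elif q == 0:
            # s2[i] does not occur in s1 at all.
            ms[i] = 0
            i -= 1
        else:
            # Shorten the match from the right: move to the enclosing
            # interval and retry the same position i.
            q = max(lcp[lb], lcp[rb + 1])
            while lcp[lb] >= q:
                lb -= 1
            while lcp[rb + 1] >= q:
                rb += 1
    return ms
```

For example, `matching_statistics("banana", "ananas")` returns `[5, 4, 3, 2, 1, 0]`. The sketch favours clarity over efficiency: each backward step and each shortening step takes linear time here, whereas a compressed index would answer the corresponding queries far faster and in much less space.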