Fast Relative Lempel-Ziv Self-index for Similar Sequences

  • Authors:
  • Huy Hoang Do;Jesper Jansson;Kunihiko Sadakane;Wing-Kin Sung

  • Affiliations:
  • National University of Singapore, COM 1, Singapore;Ochanomizu University, Tokyo, Japan;National Institute of Informatics, Tokyo, Japan;National University of Singapore, COM 1, Singapore

  • Venue:
  • FAW-AAIM'12 Proceedings of the 6th international Frontiers in Algorithmics, and Proceedings of the 8th international conference on Algorithmic Aspects in Information and Management
  • Year:
  • 2012

Abstract

Recent advances in biotechnology and web technology are generating huge collections of similar strings. People now face the problem of storing them compactly while supporting fast pattern searching. One compression scheme, called relative Lempel-Ziv compression, uses textual substitutions from a reference text as follows: given a (large) set S of strings, represent each string in S as a concatenation of substrings from a reference string R. This basic scheme gives a good compression ratio when every string in S is similar to R, but does not provide any pattern searching functionality. Here, we describe a new data structure that supports fast pattern searching.
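
The basic relative Lempel-Ziv scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's data structure and it provides no pattern searching; it only shows how a string can be written as a concatenation of substrings of a reference R. The function names `rlz_factorize` and `rlz_decode`, the greedy longest-match rule, the quadratic scan, and the literal fallback for characters that do not occur in R are all assumptions made for illustration.

```python
def rlz_factorize(reference, text):
    """Greedily parse `text` into phrases copied from `reference`.

    Each phrase is (start, length), meaning reference[start:start + length].
    A character that never occurs in `reference` is emitted as a literal
    phrase (None, char). This fallback is an assumption of the sketch.
    """
    phrases = []
    i = 0
    while i < len(text):
        best_start, best_len = -1, 0
        # Naive scan over all reference positions for the longest match.
        # A practical implementation would use a suffix array or suffix tree.
        for start in range(len(reference)):
            length = 0
            while (start + length < len(reference)
                   and i + length < len(text)
                   and reference[start + length] == text[i + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len == 0:
            phrases.append((None, text[i]))
            i += 1
        else:
            phrases.append((best_start, best_len))
            i += best_len
    return phrases


def rlz_decode(reference, phrases):
    """Rebuild the original string from its phrases and the reference."""
    parts = []
    for start, value in phrases:
        if start is None:
            parts.append(value)  # literal character
        else:
            parts.append(reference[start:start + value])
    return "".join(parts)


R = "ACGTACGGTA"
s = "ACGGTACGT"
phrases = rlz_factorize(R, s)  # [(4, 6), (1, 3)]
assert rlz_decode(R, phrases) == s
```

In this toy example the nine characters of `s` are stored as two (start, length) pairs, which is why the scheme compresses well when every string in S is similar to R.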