Motifs in Ziv-Lempel-Welch Clef

  • Authors:
  • Alberto Apostolico;Matteo Comin;Laxmi Parida

  • Venue:
  • DCC '04 Proceedings of the Conference on Data Compression
  • Year:
  • 2004

Abstract

We present variants of classical data compression paradigms by Ziv, Lempel, and Welch in which the phrases used in compression are selected among suitably chosen motifs, defined here as strings of intermittently solid and wild characters that recur more or less frequently in the source textstring. This notion emerged primarily in the analysis of biological sequences and molecules. Whereas the number of motifs in a sequence or family may be exponential in the size of the input, a linear-sized basis of irredundant motifs may be defined such that any other motif can be obtained by the union of a suitable subset from the basis. Previous study has exposed the advantages of using irredundant motifs in lossy as well as lossless off-line compression. In the present paper, we examine adaptations and extensions of classical incremental ZL and ZLW paradigms. First, hybrid schemata are proposed along these lines, in which motifs may be discovered and selected off-line, while the parse and encoding is still conducted on-line. The performances thus obtained improve on the one hand over previous off-line implementations of motif-based compression, and on the other, over the traditionally best implementations of ZLW. On the basis of this, both lossy and lossless motif-based schemata are introduced and tested that follow more closely the ZL and ZLW paradigms.
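
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the wildcard symbol `.`, the function names `motif_occurrences` and `lzw_encode`, and the choice to seed the dictionary with the characters of the input are all assumptions made for illustration. The first function locates the occurrences of a motif made of solid and wild characters; the second performs a classical ZLW-style incremental parse with ordinary solid phrases only, which is the baseline that the motif-based variants extend.

```python
from typing import List

WILD = "."  # assumed don't-care symbol; the abstract does not fix a notation


def motif_occurrences(motif: str, text: str) -> List[int]:
    """Return the start positions where `motif` matches `text`.

    A solid character must match exactly; WILD matches any single character.
    """
    m = len(motif)
    return [
        i
        for i in range(len(text) - m + 1)
        if all(p == WILD or p == c for p, c in zip(motif, text[i:i + m]))
    ]


def lzw_encode(text: str) -> List[int]:
    """Classical incremental LZW parse without motifs.

    The dictionary is seeded with the distinct characters of `text`
    (an assumption of this sketch); each emitted code is the index of the
    longest phrase already in the dictionary.
    """
    table = {ch: i for i, ch in enumerate(sorted(set(text)))}
    codes: List[int] = []
    phrase = ""
    for ch in text:
        if phrase + ch in table:
            phrase += ch                      # keep extending the current phrase
        else:
            codes.append(table[phrase])       # emit the longest known phrase
            table[phrase + ch] = len(table)   # learn the extended phrase
            phrase = ch
    if phrase:
        codes.append(table[phrase])           # flush the final phrase
    return codes


if __name__ == "__main__":
    print(motif_occurrences("a.c", "abcaxcabd"))
    print(lzw_encode("abababab"))
```

In a motif-based variant along the lines described above, dictionary phrases could contain wild characters, so that a single entry covers several distinct substrings; how the mismatching characters are then handled (encoded separately for lossless operation, or dropped for lossy operation) is specified in the paper itself and not in this sketch.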