A four-stage algorithm for updating a Burrows-Wheeler transform

  • Authors:
  • M. Salson;T. Lecroq;M. Léonard;L. Mouchard

  • Affiliations:
  • Université de Rouen, LITIS EA 4108, 76821 Mont Saint Aignan, France;Université de Rouen, LITIS EA 4108, 76821 Mont Saint Aignan, France;Université de Rouen, LITIS EA 4108, 76821 Mont Saint Aignan, France;Université de Rouen, LITIS EA 4108, 76821 Mont Saint Aignan, France and Algorithm Design Group, Department of Computer Science, Kings College London, Strand, London WC2R 2LS, England, United ...

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2009

Quantified Score

Hi-index 5.23

Visualization

Abstract

We present a four-stage algorithm that updates the Burrows-Wheeler Transform of a text T, when this text is modified. The Burrows-Wheeler Transform is used by many text compression applications and some self-index data structures. It operates by reordering the letters of a text T to obtain a new text bwt(T) which can be better compressed. Even though recent advances are offering this structure new applications, a major bottleneck still exists: bwt(T) has to be entirely reconstructed from scratch whenever T is modified. We study how standard edit operations (insertion, deletion, substitution of a letter or a factor) that transform a text T into T^' impact bwt(T). Then we present an algorithm that directly converts bwt(T) into bwt(T^'). Based on this algorithm, we also sketch a method for converting the suffix array of T into the suffix array of T^'. We finally show, based on the experiments we conducted, that this algorithm, whose worst-case time complexity is O(|T|log|T|(1+log@s/loglog|T|)), performs really well in practice and replaces advantageously the traditional approach.