Linear Suffix Array Construction by Almost Pure Induced-Sorting

  • Authors:
  • Ge Nong;Sen Zhang;Wai Hong Chan

  • Affiliations:
  • -;-;-

  • Venue:
  • DCC '09 Proceedings of the 2009 Data Compression Conference
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a linear time and space suffix array (SA) construction algorithm called the SA-IS algorithm.The SA-IS algorithm is novel because of the LMS-substrings used for the problem reduction and the pure induced-sorting (specially coined for this algorithm)used to propagate the order of suffixes as well as that of LMS-substrings, which makes the algorithm almost purely relying on induced sorting at both its crucial steps.The pure induced-sorting renders the algorithm an elegant design and in turn a surprisingly compact implementation which consists of less than 100 lines of C code.The experimental results demonstrate that this newly proposed algorithm yields noticeably better time and space efficiencies than all the currently published linear time algorithms for SA construction.