Linear Time Suffix Array Construction Using D-Critical Substrings

  • Authors:
  • Ge Nong; Sen Zhang; Wai Hong Chan

  • Affiliations:
  • Computer Science Department, Sun Yat-Sen University, P.R.C.; Dept. of Math., Comp. Sci. and Stat., SUNY College at Oneonta, U.S.A.; Department of Mathematics, Hong Kong Baptist University, Hong Kong

  • Venue:
  • CPM '09 Proceedings of the 20th Annual Symposium on Combinatorial Pattern Matching
  • Year:
  • 2009

Abstract

In this paper we present in detail a new, efficient linear time and space suffix array construction algorithm (SACA), called the D-Critical-Substring algorithm. The algorithm is built upon a novel concept called fixed-size D-Critical-Substrings, which allow us to compute suffix arrays through a balanced combination of bucket sort and induction sort. The D-Critical-Substring algorithm is very simple: a fully functioning sample implementation in C++ takes only about 100 effective lines. The results of the experiment that we conducted on data from the Canterbury and Manzini-Ferragina corpora indicate that our algorithm outperforms the two previously best-known linear time algorithms: the Kärkkäinen-Sanders (KS) and the Ko-Aluru (KA) algorithms.
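
For readers unfamiliar with the object being constructed, the following is a minimal sketch of what a suffix array is: the starting positions of all suffixes of a string, listed in lexicographic order. It is a naive comparison-sort baseline and not the D-Critical-Substring algorithm described in the abstract. The function name naive_suffix_array is hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Naive reference construction (an illustrative assumption, not the paper's
// method): sort the suffix start positions by comparing the suffixes directly.
// Worst-case cost is O(n^2 log n), unlike the linear-time algorithms discussed.
std::vector<int> naive_suffix_array(const std::string& s) {
    std::vector<int> sa(s.size());
    std::iota(sa.begin(), sa.end(), 0);  // candidate start positions 0..n-1

    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        // Compare suffix s[a..] with suffix s[b..] lexicographically.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}
```

For the input "banana", this sketch returns the positions 5, 3, 1, 0, 4, 2, corresponding to the sorted suffixes "a", "ana", "anana", "banana", "na", "nana". A linear-time SACA produces the same array without comparing whole suffixes pairwise.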