Trends in suffix sorting: a survey of low memory algorithms

  • Authors:
  • Jasbir Dhaliwal; Simon J. Puglisi; Andrew Turpin

  • Affiliations:
  • RMIT University, Melbourne, Australia; RMIT University, Melbourne, Australia; University of Melbourne, Melbourne, Australia

  • Venue:
  • ACSC '12 Proceedings of the Thirty-fifth Australasian Computer Science Conference - Volume 122
  • Year:
  • 2012

Abstract

The suffix array of a string is an array of all the string's suffixes in sorted (lexicographic) order. This remarkably simple data structure is fundamental to string processing and lies at the heart of efficient algorithms for pattern matching, pattern mining, and data compression. In many applications suffix array construction, or equivalently suffix sorting, is a computational bottleneck, and so it has been the focus of intense research over the last 20 years. This paper outlines several suffix array construction algorithms that have emerged since the survey by Puglisi, Smyth and Turpin [ACM Computing Surveys 39, 2007]. These algorithms have tended to strive for small working space (RAM), often at the cost of runtime, and make use of compressed data structures or secondary memory (disk) to achieve this goal. We provide a high-level description of each algorithm, avoiding implementation details as much as possible, and outline directions that could benefit from further research.
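
To make the definition concrete, the following is a minimal Python sketch of a suffix array and of pattern matching over it. It is an illustration only, not one of the low memory algorithms the paper surveys: the construction is the naive "sort the suffixes directly" approach, which is slow and memory-hungry on large inputs. The function names `build_suffix_array` and `find_occurrences` are hypothetical, chosen for this example.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, in lexicographic order.

    Naive construction (an assumption of this sketch, not a surveyed algorithm):
    each comparison may scan a whole suffix, and each sort key is a copy of a
    suffix, so both time and working space are far from what practical
    suffix sorting algorithms achieve.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list[int], pattern: str) -> list[int]:
    """Return all start positions of `pattern` in `text`, in increasing order.

    Suffixes that begin with `pattern` occupy one contiguous range of the
    suffix array, so two binary searches locate the range boundaries.
    """
    m = len(pattern)

    # First binary search: leftmost suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    text = "banana"
    sa = build_suffix_array(text)
    print(sa)                                 # [5, 3, 1, 0, 4, 2]
    print(find_occurrences(text, sa, "ana"))  # [1, 3]
```

For "banana" the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana", which gives the start positions 5, 3, 1, 0, 4, 2. Once the array exists, each query costs only a logarithmic number of string comparisons, which is why construction, rather than search, tends to be the bottleneck the abstract refers to.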