Algorithms on strings, trees, and sequences: computer science and computational biology
Algorithms on strings, trees, and sequences: computer science and computational biology
Suffix arrays: a new method for on-line string searches
SODA '90 Proceedings of the first annual ACM-SIAM symposium on Discrete algorithms
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM (JACM)
Constructing Suffix Trees On-Line in Linear Time
Proceedings of the IFIP 12th World Computer Congress on Algorithms, Software, Architecture - Information Processing '92, Volume 1
Optimal on-line search and sublinear time update in string matching
FOCS '95 Proceedings of the 36th Annual Symposium on Foundations of Computer Science
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Compact suffix array: a space-efficient full-text index
Fundamenta Informaticae - Special issue on computing patterns in strings
Compressed Index for Dynamic Text
DCC '04 Proceedings of the Conference on Data Compression
Succinct suffix arrays based on run-length encoding
Nordic Journal of Computing
ACM Computing Surveys (CSUR)
Linear work suffix array construction
Journal of the ACM (JACM)
Compressed representations of sequences and full-text indexes
ACM Transactions on Algorithms (TALG)
A taxonomy of suffix array construction algorithms
ACM Computing Surveys (CSUR)
Rank and select revisited and extended
Theoretical Computer Science
Compressed Suffix Trees with Full Functionality
Theory of Computing Systems
Dynamic entropy-compressed sequences and full-text indexes
ACM Transactions on Algorithms (TALG)
Linear pattern matching algorithms
SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory (swat 1973)
A four-stage algorithm for updating a Burrows-Wheeler transform
Theoretical Computer Science
Succinct dynamic dictionaries and trees
ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
LATIN'08 Proceedings of the 8th Latin American conference on Theoretical informatics
Improved dynamic rank-select entropy-bound structures
LATIN'08 Proceedings of the 8th Latin American conference on Theoretical informatics
Compressed Suffix Arrays for Massive Data
SPIRE '09 Proceedings of the 16th International Symposium on String Processing and Information Retrieval
Stream-based translation models for statistical machine translation
HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Position heaps: A simple and dynamic text indexing data structure
Journal of Discrete Algorithms
Complex Event Detection in Extremely Resource-Constrained Wireless Sensor Networks
Mobile Networks and Applications
On suffix extensions in suffix trees
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
On the number of elements to reorder when updating a suffix array
Journal of Discrete Algorithms
On-line suffix tree construction with reduced branching
Journal of Discrete Algorithms
On suffix extensions in suffix trees
Theoretical Computer Science
Efficient indexing techniques for record matching and deduplication
International Journal of Computational Vision and Robotics
The suffix tree data structure was intensively described, studied and used in the 1980s and 1990s, its linear-time construction counterbalancing its large space requirements. An equivalent data structure, the suffix array, was described by Manber and Myers in 1990. This space-economical structure was neglected for more than a decade because its construction was too slow. Since 2003, several linear-time suffix array construction algorithms have been proposed, and the suffix array has gradually replaced the suffix tree in many string processing problems. All these algorithms build the suffix array from the text, so any edit operation on the text requires constructing a brand new suffix array. In this article, we present an algorithm that updates the suffix array and the Longest Common Prefix (LCP) array when the text is edited (insertion, substitution or deletion of a letter or a factor). This algorithm is based on a recent four-stage algorithm developed for dynamic Burrows-Wheeler Transforms (BWT). To minimize the space complexity, we sample the suffix array, a technique used in BWT-based compressed indexes. We furthermore explain how this technique can be adapted to maintain a sample of the Extended Suffix Array, which contains a sample of the suffix array, a sample of the inverse suffix array and the whole LCP array. Our experiments show that the algorithm performs very well in practice, being faster than the fastest suffix array construction algorithm.
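For readers unfamiliar with the structures named in the abstract, the following Python sketch may help. It is not the update algorithm of the article: it only shows the rebuild-from-scratch baseline that the article aims to avoid. It assumes a text terminated by a unique smallest letter `$`, builds the suffix array by naive sorting (not a linear-time construction), and computes the LCP array with the well-known method of Kasai et al. All function names are hypothetical and chosen for illustration.

```python
def suffix_array(text):
    """Start positions of all suffixes of text, in lexicographic order.

    Naive construction by sorting the suffixes themselves; real
    constructions are far faster, this one is only for illustration.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def inverse_suffix_array(sa):
    """isa[p] is the rank of the suffix starting at position p."""
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


def lcp_array(text, sa):
    """lcp[r] is the length of the longest common prefix of the suffixes
    of rank r-1 and r; lcp[0] is 0 by convention (Kasai et al. method)."""
    n = len(text)
    isa = inverse_suffix_array(sa)
    lcp = [0] * n
    h = 0  # common-prefix length carried over from the previous position
    for i in range(n):
        r = isa[i]
        if r == 0:
            h = 0
            continue
        j = sa[r - 1]  # start of the suffix ranked just before suffix i
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[r] = h
        if h > 0:
            h -= 1
    return lcp


def rebuild_after_insertion(text, pos, letter):
    """Baseline only: insert one letter, then rebuild both arrays from scratch."""
    new_text = text[:pos] + letter + text[pos:]
    sa = suffix_array(new_text)
    return new_text, sa, lcp_array(new_text, sa)


if __name__ == "__main__":
    text = "banana$"
    sa = suffix_array(text)
    print(sa, lcp_array(text, sa))
    print(rebuild_after_insertion(text, 6, "s"))
```

In this sketch a single inserted letter triggers a full rebuild of both arrays, which is the cost that an in-place update of the suffix array and LCP array is meant to avoid.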