Efficient string processing in large data sets such as complete genomes relies heavily on the suffix tree and similar index data structures. For complex structural string analysis, such as the search for RNA secondary structure patterns, unidirectional suffix tree algorithms are inferior to bidirectional algorithms based on the affix tree data structure. The affix tree combines the suffix tree of a text and the suffix tree of the reverse text in one tree structure, but it suffers from large memory requirements. In this paper I present a new data structure, the affix array, which is equivalent to the affix tree in its algorithmic functionality but has smaller memory requirements and improved performance. I show a linear-time construction of the affix array that does not rely on the linear-time construction of the affix tree. I also show how bidirectional affix tree traversals can be transferred to the affix array, and I present the results of large-scale RNA secondary structure analysis based on the new data structure.
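To make the idea of indexing a text in both directions concrete, here is a minimal sketch in Python. It is not the affix array described above, whose layout the abstract does not spell out. It assumes only what the abstract states about the affix tree, namely that an index of the text is combined with an index of the reverse text. The function names (`suffix_array`, `pattern_interval`, `occurrences_forward`, `occurrences_via_reverse`) are hypothetical, and the construction is a naive sort rather than the linear-time construction the paper presents.

```python
from bisect import bisect_left, bisect_right


def suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order.

    This sorts the suffixes directly, so it is far from linear time; it is
    only meant to illustrate the data structure on small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def pattern_interval(text, sa, pattern):
    """Half-open range [lo, hi) of sa whose suffixes start with pattern."""
    # Truncating sorted suffixes to a fixed length keeps them sorted,
    # so binary search on the truncated prefixes is valid.
    prefixes = [text[i:i + len(pattern)] for i in sa]
    return bisect_left(prefixes, pattern), bisect_right(prefixes, pattern)


def occurrences_forward(text, pattern):
    """Start positions of pattern in text, found via the forward suffix array."""
    sa = suffix_array(text)
    lo, hi = pattern_interval(text, sa, pattern)
    return sorted(sa[lo:hi])


def occurrences_via_reverse(text, pattern):
    """Same positions, found by searching the reversed pattern in the reversed text."""
    rev_text = text[::-1]
    rev_sa = suffix_array(rev_text)
    lo, hi = pattern_interval(rev_text, rev_sa, pattern[::-1])
    n, m = len(text), len(pattern)
    # A match starting at j in the reversed text starts at n - j - m in the text.
    return sorted(n - j - m for j in rev_sa[lo:hi])
```

Both functions report the same occurrences, so a pattern's interval in the forward array and the interval of its reverse in the reverse array always have the same size. A bidirectional search would need to keep the two intervals in step while the pattern is extended to the left or to the right; that synchronization is not attempted in this sketch.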