In this paper we consider the prefix array π = π[1..n] of a string x = x[1..n], in which π[1] = 0 and, for i > 1, π[i] = k iff k is the largest integer such that x[1..k] = x[i..i+k-1]. The prefix array π is closely related to the border array β: an integer array β[1..n] such that β[i] = k iff the length of the longest border of x[1..i] is k. Border arrays or their variants are used in many string algorithms, and prefix arrays can be used directly for pattern-matching. It is well known that for regular strings π provides all the information that β does; we show, however, that for indeterminate strings (those containing entries that match a subset of the alphabet) π actually provides more information, in fact still enabling all the borders of every prefix of x to be specified. Since many of the entries of π are expected to be zeros, it is natural to represent π in compressed form using integer arrays POS[1..m] and LEN[1..m], where m is the number of nonzero entries in π and π[POS[j]] = LEN[j] iff the jth nonzero entry in π occurs in position POS[j] and takes the value LEN[j]. The expected value of m is n/σ - 1, where σ is the alphabet size. The straightforward way of computing POS/LEN requires computing π first, and therefore requires O(n) extra space. We describe two Θ(n)-time algorithms, PL1 and PL2, that compute POS/LEN for regular strings using only 8m bytes of storage in addition to the n bytes required for x. PL1 requires about one-third the time of the standard border array algorithm MP on English-language strings; PL2 executes faster than MP on both English-language strings and highly periodic strings on {a, b}. For indeterminate strings, we describe an extension IPL of PL1 that computes POS/LEN in O(n²) worst-case time (though generally much faster), still using only 8m bytes of additional storage. For both regular and indeterminate strings, the compressed form of π can be used for efficient pattern-matching.
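To make the definitions concrete, the following minimal Python sketch follows the "straightforward way" mentioned in the abstract for a regular string: it computes π directly from the definition, compresses it into POS/LEN, and recovers β from π. It is not PL1, PL2, or IPL, whose details are not given here; the function names (`prefix_array`, `compress`, `border_array_from_prefix`) are illustrative assumptions. The naive prefix-array loop takes quadratic time in the worst case, and the example string is arbitrary.

```python
def prefix_array(x):
    """Naive prefix array straight from the definition (regular strings only).

    Stored 0-based: pi[i - 1] holds the paper's π[i]; π[1] = 0 by convention.
    Worst-case quadratic time; this is NOT one of the paper's algorithms.
    """
    n = len(x)
    pi = [0] * n
    for i in range(1, n):
        k = 0
        # Extend the match between the prefix x[0..k-1] and x[i..i+k-1].
        while i + k < n and x[k] == x[i + k]:
            k += 1
        pi[i] = k
    return pi


def compress(pi):
    """Return (POS, LEN): 1-based positions and values of the nonzero entries."""
    pos = [i + 1 for i, v in enumerate(pi) if v > 0]
    length = [v for v in pi if v > 0]
    return pos, length


def border_array_from_prefix(pi):
    """Recover the border array β of a regular string from its prefix array.

    A match of length k starting at position i gives each prefix ending at
    i, i+1, ..., i+k-1 a border of length 1, 2, ..., k. Scanning i in
    increasing order, the first value written to a cell is the longest.
    """
    n = len(pi)
    beta = [0] * n
    for i in range(1, n):
        for j in range(pi[i]):
            if beta[i + j] == 0:
                beta[i + j] = j + 1
    return beta


if __name__ == "__main__":
    x = "abaababa"  # arbitrary example string
    pi = prefix_array(x)
    pos, length = compress(pi)
    print(pi)                            # [0, 0, 1, 3, 0, 3, 0, 1]
    print(pos, length)                   # [3, 4, 6, 8] [1, 3, 3, 1]
    print(border_array_from_prefix(pi))  # [0, 0, 1, 1, 2, 3, 2, 3]
```

For x = abaababa the prefix array has m = 4 nonzero entries, so POS and LEN each hold four integers instead of the eight entries of π. The sketch still materializes π in full, which is exactly the O(n) extra space the abstract says PL1 and PL2 avoid.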