New indices for text: PAT Trees and PAT arrays
Information retrieval
Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
An improved data structure for cumulative probability tables
Software—Practice & Experience
Succinct representations of lcp information and improvements in the compressed suffix arrays
SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
Extended application of suffix trees to data compression
DCC '96 Proceedings of the Conference on Data Compression
Opportunistic data structures with applications
FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science
Replacing suffix trees with enhanced suffix arrays
Journal of Discrete Algorithms - SPIRE 2002
ACM Computing Surveys (CSUR)
Compressed indexes for dynamic text collections
ACM Transactions on Algorithms (TALG)
Compressed Suffix Trees with Full Functionality
Theory of Computing Systems
Computing Longest Previous Factor in linear time and applications
Information Processing Letters
A Simple Algorithm for Computing the Lempel Ziv Factorization
DCC '08 Proceedings of the Data Compression Conference
Fast and Practical Algorithms for Computing All the Runs in a String
CPM '07 Proceedings of the 18th annual symposium on Combinatorial Pattern Matching
Linear pattern matching algorithms
SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory (swat 1973)
Theoretical and practical improvements on the RMQ-Problem, with applications to LCA and LCE
CPM'06 Proceedings of the 17th Annual conference on Combinatorial Pattern Matching
Dynamic rank-select structures with applications to run-length encoded texts
CPM'07 Proceedings of the 18th annual conference on Combinatorial Pattern Matching
A new succinct representation of RMQ-information and improvements in the enhanced suffix array
ESCAPE'07 Proceedings of the First international conference on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies
Lempel-Ziv factorization revisited
CPM'11 Proceedings of the 22nd annual conference on Combinatorial pattern matching
Space-Efficient Preprocessing Schemes for Range Minimum Queries on Static Arrays
SIAM Journal on Computing
Computing Lempel-Ziv factorization online
MFCS'12 Proceedings of the 37th international conference on Mathematical Foundations of Computer Science
A comparison of index-based Lempel-Ziv LZ77 factorization algorithms
ACM Computing Surveys (CSUR)
Computing regularities in strings: A survey
European Journal of Combinatorics
On compressing and indexing repetitive sequences
Theoretical Computer Science
We present a novel algorithm for finding the longest factors in a text, for which the working space is proportional to the size of the history text. Moreover, our algorithm is online and exact: unlike the previous batch algorithms [4, 5, 6, 7, 14], which need to read the entire input beforehand, it reports the longest match just after reading each character. The algorithm can be used directly for data compression, pattern analysis, and data mining. It also supports a window buffer, so that the working space can be bounded by discarding the history starting from the oldest character. Using the dynamic rank/select dictionary [17], our algorithm requires n log σ + O(n log σ) + O(n) bits of working space and O(log^3 n) time per character, for O(n log^3 n) total time, where n is the length of the history and σ is the alphabet size. We implemented our algorithm and compared it with the recent algorithms [4, 5, 14] in terms of speed and working space. On real-world data, our algorithm needs less than half the working space of the previous methods, with a reasonable decline in speed.
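To make the online behaviour concrete, here is a minimal Python sketch of the problem statement only. It is a naive quadratic-time illustration, not the algorithm of the paper, and it uses no rank/select dictionary. It assumes that "longest match" means the longest suffix of the text read so far that also ends at an earlier position in the history. The function name online_longest_matches and the window parameter are hypothetical names introduced for this illustration.

```python
def online_longest_matches(chars, window=None):
    """Yield (match_length, distance_back) right after each character is read.

    Naive illustration only: every earlier end position is compared against
    the current suffix. `window` (hypothetical parameter) bounds the history
    by discarding the oldest character once the limit is exceeded.
    """
    history = []
    for c in chars:
        history.append(c)
        if window is not None and len(history) > window:
            del history[0]  # drop the oldest character of the history
        i = len(history) - 1  # index of the character just read
        best_len, best_dist = 0, 0
        for j in range(i):  # candidate earlier end positions
            k = 0
            # extend the match backwards while the characters agree
            while k <= j and history[j - k] == history[i - k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        yield best_len, best_dist


if __name__ == "__main__":
    for pos, (length, dist) in enumerate(online_longest_matches("abab")):
        print(pos, length, dist)
```

Because the function is a generator, each result is available as soon as its character has been consumed, which mirrors the online property described above. With a window set, matches can only be found inside the retained history, so the reported lengths may be shorter than in the unbounded case.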