We focus on the problem of approximate matching of strings that have been compressed using run-length encoding. Previous studies have concentrated on the problem of computing the longest common subsequence (LCS) between two strings of length m and n, compressed to m′ and n′ runs. We extend an existing algorithm for the LCS to the Levenshtein distance, achieving O(m′n + n′m) complexity. This approach also yields an algorithm for approximate searching of a pattern of m letters (m′ runs) in a text of n letters (n′ runs) in O(mm′n′) time, for both the LCS and Levenshtein models. We then propose improvements to a greedy algorithm for the LCS, and conjecture that the improved algorithm has O(m′n′) expected-case complexity. Experimental results are provided to support the conjecture.
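To make the quantities in the abstract concrete, the following minimal Python sketch shows run-length encoding and the textbook Levenshtein dynamic program on uncompressed strings. It is a baseline for comparison only, not the algorithm described above: it runs in O(mn) time and ignores the run structure entirely. The function names `rle` and `levenshtein` are illustrative assumptions and do not come from the paper.

```python
from itertools import groupby


def rle(s):
    """Run-length encode s as a list of (letter, run_length) pairs.

    A string of m letters yields m' pairs, one per maximal run.
    """
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def levenshtein(a, b):
    """Textbook O(mn) edit distance on uncompressed strings (baseline only)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(rle("aaabbc"))
    print(levenshtein("kitten", "sitting"))
```

Here `rle("aaabbc")` has m = 6 letters but only m′ = 3 runs. The bounds quoted in the abstract depend on m′ and n′, so they improve on this baseline when the strings consist of long runs.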