Towards approximate matching in compressed strings: local subsequence recognition

Author: Alexander Tiskin
Affiliation: Department of Computer Science, University of Warwick, Coventry, UK
Venue: CSR'11 Proceedings of the 6th international conference on Computer science: theory and applications
Year: 2011




Abstract

A grammar-compressed (GC) string is a string generated by a context-free grammar. This compression model includes LZ78 and LZW compression as special cases. We consider the longest common subsequence problem and the local subsequence recognition problem on a GC-text against a plain pattern. We show that, surprisingly, both problems can be solved in time that is within a polylogarithmic factor of the best existing algorithms for the same problems on a plain text. In a wider context presented elsewhere, we use these results as a stepping stone to efficient approximate matching on a GC-text.
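
To make the setting concrete, the following is a minimal Python sketch of a GC-text and of the naive baseline for the longest common subsequence problem: decompress the text, then run the textbook dynamic programme against the plain pattern. It is not the algorithm of the paper, which avoids full decompression. The grammar encoding (a dict mapping each rule name to either a single character or a pair of rule names) and the names `slp_length`, `slp_expand`, `lcs_length` and the rules `F1` to `F6` are assumptions introduced here for illustration only.

```python
# Hypothetical encoding of a grammar-compressed text: every rule is either
# a single character (terminal) or a pair of names of other rules.
def slp_length(grammar, symbol):
    """Length of the string generated by `symbol`, without expanding it."""
    lengths = {}

    def go(s):
        if s not in lengths:
            rule = grammar[s]
            if isinstance(rule, str):
                lengths[s] = 1
            else:
                lengths[s] = go(rule[0]) + go(rule[1])
        return lengths[s]

    return go(symbol)


def slp_expand(grammar, symbol):
    """Fully decompress `symbol`; the result may be exponentially long."""
    rule = grammar[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return slp_expand(grammar, left) + slp_expand(grammar, right)


def lcs_length(a, b):
    """Textbook LCS length by dynamic programming, O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# A Fibonacci-style grammar: each rule concatenates the two previous ones.
grammar = {
    "F1": "b",
    "F2": "a",
    "F3": ("F2", "F1"),
    "F4": ("F3", "F2"),
    "F5": ("F4", "F3"),
    "F6": ("F5", "F4"),
}

text = slp_expand(grammar, "F6")
print(slp_length(grammar, "F6"), text, lcs_length(text, "bbb"))
```

With k rules of this shape the generated text has length exponential in k, so the decompress-then-compare baseline can take time exponential in the size of the grammar. That gap is what makes it notable that the problems can be solved within a polylogarithmic factor of the plain-text running time.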