Property matching and weighted matching

Authors:
Amihood Amir;Eran Chencinski;Costas Iliopoulos;Tsvi Kopelowitz;Hui Zhang
Affiliations:
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel;Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel;Department of Computer Science, King's College London, Strand, London WC2R 2LS, United Kingdom;Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel;Department of Computer Science, King's College London, Strand, London WC2R 2LS, United Kingdom
Venue:
Theoretical Computer Science
Year:
2008

Abstract

In many pattern matching applications, the text has properties attached to its various parts. Pattern Matching with Properties (Property Matching, for short) involves string matching between the pattern and the text, together with the requirement that the matched part of the text satisfies some property. Some immediate examples come from molecular biology, where it has long been the practice to consider special areas of the genome by their structures. Sequential matching in a text with properties is straightforward. However, indexing a text with properties becomes difficult if we want the query time to be output dependent. We present an algorithm for indexing a text with properties in $O(n \log|\Sigma| + n \log\log n)$ preprocessing time and $O(|P| \log|\Sigma| + tocc_\pi)$ time per query, where $n$ is the length of the text, $P$ is the sought pattern, $\Sigma$ is the alphabet, and $tocc_\pi$ is the number of occurrences of the pattern that satisfy some property $\pi$. As a practical use of Property Matching, we show how to solve Weighted Matching problems using techniques from Property Matching. Weighted sequences have recently been introduced as a tool to handle a set of sequences that are not identical but have many local similarities. The weighted sequence is a "statistical image" of this set, in which we are given the probability of every symbol's occurrence at every text location. Weighted matching problems are pattern matching problems where the given text is weighted. We present a reduction from Weighted Matching to Property Matching that allows off-the-shelf solutions to numerous weighted matching problems, including indexing, swapped matching, parameterized matching, approximate matching, and many more. Assuming that one seeks the occurrences of pattern $P$ with probability $\varepsilon$ in a weighted text $T$ of length $n$, we reduce the problem to a property matching problem of pattern $P$ in a text $T'$ of length $O(n (1/\varepsilon)^2 \log(1/\varepsilon))$.
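
The two problem statements in the abstract can be illustrated with a small Python sketch. It is only an illustration of the definitions, not the indexing algorithm or the reduction from the paper. Two assumptions are made that the abstract does not spell out: a property is taken to be a set of inclusive intervals `(s, f)` over text positions, with an occurrence counting only if it lies entirely inside one interval, and a weighted text is taken to be a list of per-position symbol probabilities, with an occurrence counting if the product of the matched symbols' probabilities is at least `eps`. The function names `property_match` and `weighted_match` are hypothetical.

```python
def property_match(text, pattern, intervals):
    """Sequential property matching (illustrative sketch).

    Assumption: `intervals` is a list of inclusive (s, f) pairs with
    0 <= s <= f < len(text); an occurrence text[i:i+m] qualifies only
    if it lies entirely inside at least one interval.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # reach[i] = largest right end f over all intervals with s <= i, else -1.
    reach = [-1] * n
    for s, f in intervals:
        if 0 <= s < n:
            reach[s] = max(reach[s], f)
    for i in range(1, n):
        reach[i] = max(reach[i], reach[i - 1])
    hits = []
    i = text.find(pattern)
    while i != -1:
        # Some interval starting at or before i must cover the last position.
        if reach[i] >= i + m - 1:
            hits.append(i)
        i = text.find(pattern, i + 1)  # step by 1 to allow overlapping hits
    return hits


def weighted_match(weighted_text, pattern, eps):
    """Naive weighted matching (illustrative sketch).

    Assumption: weighted_text[j] is a dict mapping each symbol to its
    probability at position j; an occurrence at i qualifies if the product
    of the probabilities of pattern[k] at position i + k is at least eps.
    """
    n, m = len(weighted_text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        prob = 1.0
        for k, symbol in enumerate(pattern):
            prob *= weighted_text[i + k].get(symbol, 0.0)
            if prob < eps:
                break  # the product can only shrink further
        if m > 0 and prob >= eps:
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(property_match("abcabcabc", "abc", [(0, 3), (4, 8)]))
    wt = [{"a": 1.0}, {"a": 0.5, "b": 0.5}, {"b": 1.0}]
    print(weighted_match(wt, "ab", 0.5))
```

Both functions scan the text sequentially, which is the straightforward setting the abstract mentions; the naive weighted scan takes time proportional to the product of the text and pattern lengths and makes no attempt to reach the output-dependent query bound of the index.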