We present a new algorithm for multiple approximate string matching. It reads l-grams backwards from a text window until they prove that no occurrence can contain the part of the window read, and then shifts the window.

We show analytically that our algorithm is optimal on average. Our first contribution is therefore to fill an important gap in the area, since no average-optimal algorithm existed for multiple approximate string matching.

We consider several variants of and practical improvements to our algorithm, and show experimentally that they are robust to the number of patterns and are the fastest for low difference ratios, displacing the long-standing best algorithms. Our second contribution is therefore a practical algorithm for this problem, far better than any existing alternative in many cases of interest. On real-life texts, our algorithm is especially interesting for computational biology applications.

In particular, we show that our algorithm can be used successfully to search for a single pattern, a setting where many more competing algorithms exist. Our algorithm is also average-optimal in this case, the second such algorithm after that of Chang and Marr. However, our algorithm permits higher difference ratios than Chang and Marr's, and this is our third contribution.

In practice, our algorithm is competitive in this scenario too, being the fastest for low difference ratios and moderate alphabet sizes. This is our fourth contribution, and it answers affirmatively the question of whether a practical average-optimal approximate string-matching algorithm exists.
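The window-filtering idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions not stated in the abstract: "differences" are taken to be edit distance, all patterns have the same length m, the window has length m - k, and each l-gram is scored by the minimum number of differences needed to match it inside any pattern. The function names (`min_diff_to_substring`, `matches_at`, `filter_search`) are hypothetical, and the l-gram scores are memoized lazily here rather than precomputed in a table.

```python
def min_diff_to_substring(gram, pattern):
    """Minimum edit distance between gram and any substring of pattern."""
    prev = [0] * (len(pattern) + 1)  # an empty gram matches anywhere at cost 0
    for i, gc in enumerate(gram, 1):
        cur = [i]  # gram[:i] against an empty substring costs i deletions
        for j, pc in enumerate(pattern, 1):
            cur.append(min(prev[j - 1] + (gc != pc), prev[j] + 1, cur[j - 1] + 1))
        prev = cur
    return min(prev)


def matches_at(text, start, pattern, k):
    """True if some substring text[start:e] is within k differences of pattern."""
    prev = list(range(len(pattern) + 1))  # pattern prefixes against empty text
    if prev[-1] <= k:
        return True
    # An occurrence is at most len(pattern) + k characters long.
    for tc in text[start:start + len(pattern) + k]:
        cur = [prev[0] + 1]
        for j, pc in enumerate(pattern, 1):
            cur.append(min(prev[j - 1] + (tc != pc), prev[j] + 1, cur[j - 1] + 1))
        prev = cur
        if prev[-1] <= k:
            return True
    return False


def filter_search(text, patterns, k, l):
    """Return (start, pattern) pairs for approximate occurrences (assumed setup)."""
    m = len(patterns[0])
    w = m - k  # window length: every occurrence is at least this long
    assert all(len(p) == m for p in patterns) and k >= 0 and 1 <= l <= w
    cache = {}

    def score(gram):
        # Lazily memoized lower bound on the differences this l-gram forces.
        if gram not in cache:
            cache[gram] = min(min_diff_to_substring(gram, p) for p in patterns)
        return cache[gram]

    hits, i, n = [], 0, len(text)
    while i + w <= n:
        total, pos, shifted = 0, i + w, False
        while pos - l >= i:  # read non-overlapping l-grams backwards
            pos -= l
            total += score(text[pos:pos + l])
            if total > k:
                # No occurrence can contain text[pos:i+w], so none starts in
                # [i, pos]; slide the window past the last l-gram read.
                i, shifted = pos + 1, True
                break
        if not shifted:
            # The window could not be discarded: verify it, then advance by one.
            hits.extend((i, p) for p in patterns if matches_at(text, i, p, k))
            i += 1
    return hits


if __name__ == "__main__":
    print(filter_search("xxabcdexx", ["abcde"], k=0, l=2))
```

The filter is safe because the l-grams read are disjoint pieces of any occurrence that covers them, so their scores add up to a lower bound on that occurrence's differences. The choice of l trades shift length against the cost of scoring each l-gram; the sketch leaves that parameter to the caller.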