The string matching problem, i.e. the task of finding all occurrences of one string as a substring of another, is a fundamental problem in computer science. It has recently received a great deal of attention because of its numerous applications in computational biology. In this paper we address a modified version of Horspool's string matching algorithm that uses the probabilities of the different symbols to speed up the search. We show that the modified algorithm has a linear average running time, and we prove a precise asymptotic representation of that running time. A comparison of the average running time of the modified algorithm with well-known results for the original method shows a substantial speedup for most symbol distributions. Finally, we show that the distribution of the symbols can be approximated to high precision using a random sample of sublinear size.
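The abstract does not spell out how the symbol probabilities enter the search, so the following Python sketch shows only one plausible reading, not the paper's actual method. It assumes the modification is to compare pattern positions in order of increasing symbol probability, so that a mismatch tends to be found early, while keeping the standard Horspool shift table. The names `estimate_frequencies` and `horspool_by_rarity` are hypothetical, as is the uniform sampling scheme used to approximate the symbol distribution.

```python
import random
from collections import Counter


def estimate_frequencies(text, sample_size, seed=0):
    """Estimate relative symbol frequencies from a uniform random sample
    of text positions (hypothetical helper; sampling scheme is assumed)."""
    rng = random.Random(seed)
    sample = [text[rng.randrange(len(text))] for _ in range(sample_size)]
    counts = Counter(sample)
    return {sym: c / sample_size for sym, c in counts.items()}


def horspool_by_rarity(text, pattern, freq):
    """Return all start positions of pattern in text.

    Standard Horspool shifts are used; the only change (an assumption,
    not confirmed by the abstract) is the order in which the pattern
    positions are compared against the current text window.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Standard Horspool shift table: for each symbol in pattern[:-1],
    # the distance from its last occurrence to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}

    # Assumed modification: check the rarest pattern symbols first.
    # Symbols missing from the estimate are treated as probability 0.
    order = sorted(range(m), key=lambda i: freq.get(pattern[i], 0.0))

    matches = []
    pos = 0
    while pos <= n - m:
        if all(text[pos + i] == pattern[i] for i in order):
            matches.append(pos)
        # Shift by the symbol aligned with the last pattern position;
        # symbols not in the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return matches


if __name__ == "__main__":
    text = "abracadabra abracadabra"
    freq = estimate_frequencies(text, sample_size=8)
    print(horspool_by_rarity(text, "abra", freq))
```

Because the comparison order affects only how quickly a mismatch is detected and not which windows are examined, the sketch returns the same matches as plain Horspool for any frequency estimate, including an empty one.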