High-error approximate dictionary search using estimate hash comparisons

Authors:
Johan Rönnblom
Affiliations:
Senapsgatan 12, Gothenburg, Sweden
Venue:
Software—Practice & Experience
Year:
2007

Abstract

A method for finding all matches in a pre-processed dictionary for a query string q and with at most k differences is presented. A very fast constant-time estimate using hashes is presented. A tree structure is used to minimize the number of estimates made. Practical tests are performed, showing that the estimate can filter out 99% of the full comparisons for 40% error rates and dictionaries of up to four million words. The tree is found to be efficient up to a 50% error rate. Copyright © 2006 John Wiley & Sons, Ltd.
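
The abstract does not specify the hash or the tree, so the following is only a minimal sketch of the general filter-then-verify idea, not the paper's actual method. It assumes a hypothetical 32-bit character-presence signature as the hash; the names `signature`, `estimate`, `build_index` and `search` are illustrative. Because one edit operation can clear at most one signature bit and set at most one, the estimate is a lower bound on the edit distance, so skipping a word whose estimate exceeds k never loses a match. The tree structure that reduces the number of estimates is not reproduced here; the sketch scans the dictionary linearly.

```python
def signature(word, bits=32):
    """Hypothetical hash: one bit per character bucket present in the word."""
    sig = 0
    for ch in word:
        sig |= 1 << (ord(ch) % bits)
    return sig


def estimate(sig_a, sig_b):
    """Constant-time lower bound on the edit distance between two words.

    Each insertion, deletion or substitution clears at most one bit and
    sets at most one bit, so the larger one-sided difference is a bound.
    """
    only_a = bin(sig_a & ~sig_b).count("1")
    only_b = bin(sig_b & ~sig_a).count("1")
    return max(only_a, only_b)


def edit_distance(a, b):
    """Full comparison: standard Levenshtein dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def build_index(words):
    """Pre-process the dictionary by storing each word with its signature."""
    return [(w, signature(w)) for w in words]


def search(index, query, k):
    """Return (matches, number of full comparisons actually performed)."""
    q_sig = signature(query)
    matches = []
    full_comparisons = 0
    for word, sig in index:
        if estimate(q_sig, sig) > k:
            continue  # filtered out by the cheap estimate
        full_comparisons += 1
        if edit_distance(query, word) <= k:
            matches.append(word)
    return matches, full_comparisons
```

Calling `search(build_index(words), q, k)` returns the matching words together with the count of full comparisons, which can be compared against `len(words)` to see how much work the estimate saved for a given dictionary and error rate.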