The need to correct garbled strings arises in many areas of natural language processing. If a dictionary is available that covers all possible input tokens, a natural set of candidates for correcting an erroneous input P is the set of all dictionary words whose Levenshtein distance to P does not exceed a given (small) bound k. In this article we describe methods for efficiently selecting such candidate sets. As a starting point, we introduce a basic correction method based on the concept of a "universal Levenshtein automaton." We then show how two filtering methods known from the field of approximate text search can be used to improve the basic procedure significantly. The first method, which uses standard dictionaries plus dictionaries with reversed words, leads to very short correction times for most classes of input strings. Our evaluation results demonstrate that correction times for fixed distance bounds depend on the expected number of correction candidates, which decreases for longer input words. Similarly, the choice of an optimal filtering method depends on the length of the input words.
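To make the notion of a candidate set concrete, the following is a minimal sketch of the definition only: it scans the whole dictionary and keeps every word within Levenshtein distance k of the input P. This is a naive baseline, not the universal Levenshtein automaton or either filtering method of the article. The function names `levenshtein` and `candidates` and the sample word list are assumptions made for illustration.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def candidates(pattern: str, dictionary: Iterable[str], k: int) -> List[str]:
    """Return all dictionary words whose distance to pattern is at most k.

    Hypothetical brute-force helper: it visits every word, so its cost grows
    with the dictionary size.
    """
    result = []
    for word in dictionary:
        # A length difference above k already forces a distance above k.
        if abs(len(word) - len(pattern)) > k:
            continue
        if levenshtein(pattern, word) <= k:
            result.append(word)
    return result


if __name__ == "__main__":
    words = ["child", "cold", "hold", "world", "chord", "school"]
    print(candidates("chold", words, 1))
```

With the sample list above, the garbled input "chold" and k = 1 yield the candidates child, cold, hold and chord. Because the scan touches every entry, it is only practical for small dictionaries, which is the motivation for the automaton-based and filtering approaches the article describes.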