In this paper, we propose an OCR post-processing method that combines a language model, OCR hypothesis information and an error model. The approach can be seen as a flexible and efficient way to perform Stochastic Error-Correcting Language Modeling. We use Weighted Finite-State Transducers (WFSTs) to represent three components: the language model; the complete set of OCR hypotheses, interpreted as a sequence of vectors of a posteriori class probabilities; and an error model with symbol substitutions, insertions and deletions. This approach combines the practical advantages of a decoupled (OCR + post-processor) model with the error-recovery power of an integrated model.
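The way the three knowledge sources interact can be illustrated with a small sketch. It is not the paper's implementation: instead of composing WFSTs, it uses plain dynamic programming over an explicit word list, which gives the same minimum-cost answer for a finite lexicon when costs are negative log probabilities. The function names, the toy lexicon, the posterior values and the edit costs are all assumptions made for illustration.

```python
import math

INF = float("inf")


def neg_log(p):
    """Cost of a probability as a negative log (lower is better)."""
    return -math.log(p)


def alignment_cost(word, posteriors, sub_cost, ins_cost, del_cost):
    """Minimum cost of aligning `word` with a sequence of posterior vectors.

    Each element of `posteriors` is a dict mapping a character to its
    a posteriori probability for one OCR frame (hypothetical format).
    """
    n, m = len(word), len(posteriors)
    # cost[i][j]: best cost of aligning word[:i] with posteriors[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            c = cost[i][j]
            if c == INF:
                continue
            if i < n:
                # Deletion: the OCR output has no frame for word[i].
                cost[i + 1][j] = min(cost[i + 1][j], c + del_cost)
            if j < m:
                # Insertion: the OCR produced a spurious frame.
                cost[i][j + 1] = min(cost[i][j + 1], c + ins_cost)
            if i < n and j < m:
                frame = posteriors[j]
                p = frame.get(word[i], 0.0)
                match = neg_log(p) if p > 0 else INF
                # Substitution: the frame really shows another character.
                other = min(
                    (neg_log(q) for ch, q in frame.items()
                     if ch != word[i] and q > 0),
                    default=INF,
                )
                step = min(match, sub_cost + other)
                cost[i + 1][j + 1] = min(cost[i + 1][j + 1], c + step)
    return cost[n][m]


def correct(posteriors, lexicon, sub_cost=4.0, ins_cost=5.0, del_cost=5.0):
    """Return the lexicon word with the lowest combined cost.

    `lexicon` maps words to unigram probabilities and stands in for the
    language model; the edit costs are assumed, not estimated.
    """
    best_word, best_cost = None, INF
    for word, prob in lexicon.items():
        total = neg_log(prob) + alignment_cost(
            word, posteriors, sub_cost, ins_cost, del_cost
        )
        if total < best_cost:
            best_word, best_cost = word, total
    return best_word, best_cost


# Toy input: the OCR slightly prefers "o" in the middle frame.
lexicon = {"cat": 0.6, "cot": 0.3, "cart": 0.1}
posteriors = [
    {"c": 0.9, "e": 0.1},
    {"a": 0.4, "o": 0.6},
    {"t": 0.95, "l": 0.05},
]
word, cost = correct(posteriors, lexicon)
```

In this toy input the OCR's top choice per frame spells "cot", but the language model prior makes "cat" the cheaper hypothesis overall, because the full posterior vector is kept rather than only the best character. A WFST formulation would reach the same decision by composing the three machines and taking the shortest path, and it scales to language models that cannot be enumerated word by word.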