Stochastic Error-Correcting Parsing for OCR Post-Processing
ICPR '00 Proceedings of the International Conference on Pattern Recognition - Volume 4
In an OCR post-processing task, a language model is used to find the best transformation of the OCR hypothesis into a string compatible with the language. The cost of this transformation serves as a confidence value for rejecting strings that are unlikely to be correct, so that the error rate of the accepted strings can be strictly controlled by the user. In this work, the expected error rate distribution of an unknown language model is estimated from a training set composed of known language models. As a result, after building a new language model, the user can automatically "fix" the expected error rate at an acceptable level instead of having to tune an arbitrary threshold.
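The rejection mechanism described above can be illustrated with a small sketch. Assuming a labeled calibration set of (transformation cost, correctness) pairs, the function below (hypothetical, not from the paper) picks the largest cost threshold for which the error rate among accepted strings stays within a user-chosen bound:

```python
def threshold_for_error_rate(costs, correct, max_error_rate):
    """Return the largest cost threshold such that, among accepted
    strings (those with cost <= threshold), the fraction of incorrect
    strings does not exceed max_error_rate. Returns None if no
    threshold can satisfy the bound.

    costs          -- transformation cost of each string (lower = more confident)
    correct        -- whether each string was actually correct
    max_error_rate -- error rate the user is willing to accept
    """
    best = None
    accepted = 0
    errors = 0
    # Sweep thresholds in order of increasing cost; each step accepts
    # one more string and updates the running error rate.
    for cost, is_correct in sorted(zip(costs, correct)):
        accepted += 1
        if not is_correct:
            errors += 1
        if errors / accepted <= max_error_rate:
            best = cost
    return best


# Toy calibration set: one error among four strings.
thr = threshold_for_error_rate(
    costs=[0.1, 0.2, 0.3, 0.4],
    correct=[True, True, False, True],
    max_error_rate=0.25,
)
print(thr)  # 0.4: accepting all four strings yields a 25% error rate
```

In the paper's setting the threshold is not swept on held-out data for the new model; instead, the error rate distribution learned from known language models supplies it directly, which is what removes the manual tuning step.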