IEEE Transactions on Pattern Analysis and Machine Intelligence
We describe the current state of a system that recognizes printed text of various fonts and sizes for the Roman alphabet. The system combines several techniques to improve the overall recognition rate. Thinning and shape extraction are performed directly on a graph of the run-length encoding of a binary image. The resulting strokes and other shapes are mapped, using a shape-clustering approach, into binary features, which are then fed into a statistical Bayesian classifier. Large-scale trials have shown top-choice accuracy better than 97 percent on mixtures of six dissimilar fonts, and over 99 percent on most single fonts, over a range of point sizes. Certain remaining confusion classes are disambiguated through contour analysis, and characters suspected of being merged are split and reclassified. Finally, layout and linguistic context are applied. The results are illustrated by sample pages.
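The abstract does not say how the graph over the run-length encoding is built. As a rough illustration only, the sketch below shows one common construction (often called a line adjacency graph): each horizontal run of foreground pixels becomes a node, and runs on consecutive rows are linked when their column ranges overlap. The function names `run_length_encode` and `build_run_graph`, the tuple layout, and the overlap rule are assumptions made for this example, not details taken from the system described.

```python
from typing import Dict, List, Tuple

# A run is (row, start_col, end_col), with end_col inclusive.
Run = Tuple[int, int, int]


def run_length_encode(image: List[List[int]]) -> List[Run]:
    """Return the horizontal runs of nonzero pixels, row by row."""
    runs: List[Run] = []
    for r, row in enumerate(image):
        c, n = 0, len(row)
        while c < n:
            if row[c]:
                start = c
                while c < n and row[c]:
                    c += 1
                runs.append((r, start, c - 1))
            else:
                c += 1
    return runs


def build_run_graph(runs: List[Run]) -> Dict[int, List[int]]:
    """Link runs on adjacent rows whose column ranges overlap.

    Assumption for this sketch: overlap is tested without diagonal
    contact, i.e. two runs must share at least one column.
    """
    graph: Dict[int, List[int]] = {i: [] for i in range(len(runs))}
    by_row: Dict[int, List[int]] = {}
    for i, (r, _, _) in enumerate(runs):
        by_row.setdefault(r, []).append(i)
    for i, (r, s, e) in enumerate(runs):
        for j in by_row.get(r + 1, []):
            _, s2, e2 = runs[j]
            if s2 <= e and s <= e2:
                graph[i].append(j)
                graph[j].append(i)
    return graph


# A small "U" shape: two vertical strokes joined by a bottom bar.
image = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
runs = run_length_encode(image)
graph = build_run_graph(runs)
print(runs)
print(graph)
```

In this toy example the bottom bar is the only run with two neighbours on the row above, which marks it as the place where two strokes join. Junction nodes of this kind are the sort of structure a thinning or shape-extraction pass could work from without revisiting individual pixels.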