A HMM-based approach to recognize ultra low resolution anti-aliased words
PReMI'07 Proceedings of the 2nd international conference on Pattern recognition and machine intelligence
In this paper, we introduce and evaluate a system capable of recognizing ultra low resolution words extracted from images such as those frequently embedded in web pages. The design of the system has been driven by the following constraints. First, the system has to recognize small font sizes to which anti-aliasing and resampling procedures have been applied. Such procedures add noise to the patterns and complicate any a priori segmentation of the characters. Second, the system has to be able to recognize any word in an open vocabulary setting, potentially mixing different languages. Finally, the training procedure must be automatic, i.e. it must not require manually extracting, segmenting and labeling a large set of data. These constraints led us to an architecture based on ergodic HMMs in which states are associated with the characters. We also introduce several performance improvements: increasing the order of the emission probability estimators and adding minimum and maximum duration constraints to the character models. The proposed system is evaluated on different font sizes and families, showing good robustness for sizes down to 6 points.
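The idea of an ergodic HMM whose states stand for characters can be illustrated with a minimal sketch. It is not the system described in the abstract: the two-letter alphabet, the transition and emission values, and the helper names `viterbi` and `collapse` are all assumptions made for illustration, and neither the higher-order emission estimators nor the duration constraints are reproduced. Every state may follow every other state, so no prior segmentation or fixed vocabulary is needed; the decoder picks the most likely state per frame and runs of the same state are merged into one character.

```python
import math


def viterbi(log_emit, log_trans, log_init):
    """Return the most likely state path through a fully connected HMM.

    log_emit[t][s]  : log-likelihood of frame t under state s
    log_trans[r][s] : log-probability of moving from state r to state s
    log_init[s]     : log-probability of starting in state s
    """
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for frame in log_emit[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def collapse(path, alphabet):
    """Merge consecutive repeats of a state into a single character."""
    chars = []
    for i, s in enumerate(path):
        if i == 0 or s != path[i - 1]:
            chars.append(alphabet[s])
    return "".join(chars)


# Hypothetical toy model: two character states, four image frames.
alphabet = "ab"
log_init = [math.log(0.5), math.log(0.5)]
# Ergodic topology: every state can be reached from every state.
log_trans = [[math.log(0.6), math.log(0.4)],
             [math.log(0.4), math.log(0.6)]]
# Made-up per-frame likelihoods: frames 0-1 look like 'a', frames 2-3 like 'b'.
log_emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.9), math.log(0.1)],
            [math.log(0.1), math.log(0.9)],
            [math.log(0.1), math.log(0.9)]]

path = viterbi(log_emit, log_trans, log_init)
word = collapse(path, alphabet)
print(path, word)
```

One limitation of this toy version is that merging runs of identical states cannot distinguish a doubled letter from a single wide one; bounding how long a character state may persist is one way such ambiguity can be reduced.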