Large-lexicon attribute-consistent text recognition in natural images

Authors:
Tatiana Novikova;Olga Barinova;Pushmeet Kohli;Victor Lempitsky
Affiliations:
Lomonosov Moscow State University, Russia;Lomonosov Moscow State University, Russia;Microsoft Research Cambridge, UK;Yandex, Russia
Venue:
ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI
Year:
2012

Abstract

This paper proposes a new model for word recognition in natural images that captures both the visual and the lexicon consistency of words in a single probabilistic framework. Our approach combines local likelihood and pairwise positional consistency priors with higher-order priors that enforce consistency of characters (lexicon) and their attributes (font and colour). Unlike traditional stage-based methods, word recognition in our framework is performed by estimating the maximum a posteriori (MAP) solution under the joint posterior distribution of the model. MAP inference is carried out with weighted finite-state transducers (WFSTs). We show how efficient WFST operations can be used to find the most likely word under the model. We evaluate our method on a range of challenging datasets (ICDAR'03, SVT, ICDAR'11). Experimental results demonstrate that our method outperforms state-of-the-art methods for cropped word recognition.
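To make the idea of lexicon-constrained MAP decoding concrete, the following is a minimal toy sketch, not the paper's method. It assumes a fixed segmentation with one character per segment, uses only unary character log-likelihoods plus an optional per-word log-prior, and replaces the WFST machinery with a brute-force scan of a small lexicon. The function name `map_word`, the data layout, and the example scores are all hypothetical.

```python
import math

def map_word(char_logps, lexicon, word_logprior=None):
    """Toy MAP decoding: return the lexicon word with the highest joint score.

    char_logps: list of dicts, one per character segment, mapping a
                character to its log-likelihood (hypothetical layout).
    lexicon:    iterable of candidate words.
    word_logprior: optional dict of per-word log-priors (defaults to 0).
    """
    prior = word_logprior or {}
    best_word, best_score = None, -math.inf
    for word in lexicon:
        if len(word) != len(char_logps):
            continue  # toy assumption: exactly one character per segment
        score = prior.get(word, 0.0)
        for ch, logps in zip(word, char_logps):
            # characters the classifier never proposed get zero probability
            score += logps.get(ch, -math.inf)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score

# Hypothetical classifier outputs for three segments.
char_logps = [
    {"c": math.log(0.6), "e": math.log(0.4)},
    {"a": math.log(0.7), "o": math.log(0.3)},
    {"t": math.log(0.3), "l": math.log(0.7)},
]
lexicon = ["cat", "cot", "eat", "dog"]

# Picking the best character per segment independently ignores the lexicon.
greedy = "".join(max(d, key=d.get) for d in char_logps)
best, score = map_word(char_logps, lexicon)
print(greedy, best)
```

Here the independent per-segment choice yields "cal", which is not a word, while the joint search returns "cat". A linear scan like this does not scale to a large lexicon; the abstract states that the paper instead relies on efficient WFST operations, and it additionally uses pairwise positional and attribute (font and colour) priors that this sketch omits.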