Scene Text Recognition Using Similarity and a Lexicon with Sparse Belief Propagation

Authors: Jerod J. Weinman; Erik Learned-Miller; Allen R. Hanson
Affiliations: Grinnell College, Grinnell; University of Massachusetts Amherst, Amherst; University of Massachusetts Amherst, Amherst
Venue: IEEE Transactions on Pattern Analysis and Machine Intelligence
Year: 2009

Cited by 15:

Word spotting in the wild. ECCV'10 Proceedings of the 11th European conference on Computer vision: Part I
Scalable probabilistic databases with factor graphs and MCMC. Proceedings of the VLDB Endowment
A method for text localization and recognition in real-world images. ACCV'10 Proceedings of the 10th Asian conference on Computer vision - Volume Part III
Categorization of display ads using image and landing page features. Proceedings of the Third Workshop on Large Scale Data Mining: Theory and Applications
(Computer) vision without sight. Communications of the ACM
Bounding the probability of error for high precision optical character recognition. The Journal of Machine Learning Research
Recognizing natural scene characters by convolutional neural network and bimodal image enhancement. CBDAR'11 Proceedings of the 4th international conference on Camera-Based Document Analysis and Recognition
NEOCR: a configurable dataset for natural image text recognition. CBDAR'11 Proceedings of the 4th international conference on Camera-Based Document Analysis and Recognition
On Learning Conditional Random Fields for Stereo. International Journal of Computer Vision
Estimation and selection via absolute penalized convex minimization and its multistage adaptive applications. The Journal of Machine Learning Research
Text extraction from videos using a hybrid approach. Proceedings of the International Conference on Advances in Computing, Communications and Informatics
Large-lexicon attribute-consistent text recognition in natural images. ECCV'12 Proceedings of the 12th European conference on Computer Vision - Volume Part VI
Text extraction from scene images by character appearance and structure modeling. Computer Vision and Image Understanding
Benchmarking recognition results on camera captured word image data sets. Proceeding of the workshop on Document Analysis and Recognition
Scene text recognition and tracking to identify athletes in sport videos. Multimedia Tools and Applications

Abstract

Scene text recognition (STR) is the recognition of text anywhere in the environment, such as signs and storefronts. Relative to document recognition, it is challenging because of font variability, minimal language context, and uncontrolled conditions. Much information available to solve this problem is frequently ignored or used sequentially. Similarity between character images is often overlooked as useful information. Because of language priors, a recognizer may assign different labels to identical characters. Directly comparing characters to each other, rather than only to a model, helps ensure that similar instances receive the same label. Lexicons improve recognition accuracy but are typically applied post hoc. We introduce a probabilistic model for STR that integrates similarity, language properties, and lexical decision. Inference is accelerated with sparse belief propagation, a bottom-up method for shortening messages by reducing the dependency between weakly supported hypotheses. By fusing information sources in one model, we eliminate unrecoverable errors that result from sequential processing, improving accuracy. In experimental results recognizing text from images of signs in outdoor scenes, incorporating similarity reduces character recognition error by 19 percent, the lexicon reduces word recognition error by 35 percent, and sparse belief propagation reduces the lexicon words considered by 99.9 percent with a 12x speedup and no loss in accuracy.
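
The idea of shortening messages by dropping weakly supported hypotheses can be illustrated with a toy example. The sketch below is not the paper's model or its pruning criterion: it assumes a simple chain of discrete variables, and it assumes a cumulative-mass threshold `eps` as the rule for deciding which states count as weakly supported. The names `sparsify` and `forward_messages` are hypothetical.

```python
import numpy as np

def sparsify(belief, eps=1e-3):
    """Indices of the smallest set of states holding at least 1 - eps of the mass.

    Assumption: a cumulative-mass cutoff stands in for whatever
    criterion a real sparse belief propagation scheme would use.
    """
    p = np.asarray(belief, dtype=float)
    p = p / p.sum()
    order = np.argsort(-p)                 # most probable states first
    cum = np.cumsum(p[order])
    k = int(np.searchsorted(cum, 1.0 - eps)) + 1
    return np.sort(order[:k])

def forward_messages(unary, pairwise, eps=1e-3):
    """Sum-product forward pass on a chain, summing only over kept states."""
    n, k = unary.shape
    msg = np.ones(k)
    kept_sizes = []
    for i in range(n - 1):
        belief = unary[i] * msg            # local evidence times incoming message
        keep = sparsify(belief, eps)       # drop weakly supported hypotheses
        kept_sizes.append(len(keep))
        msg = belief[keep] @ pairwise[keep, :]   # shortened sum over sources
        msg /= msg.sum()
    return msg, kept_sizes

# Toy data (illustrative values only): 2 nodes, 4 candidate labels.
unary = np.array([[0.90, 0.09, 0.009, 0.001],
                  [0.25, 0.25, 0.25, 0.25]])
pairwise = np.full((4, 4), 0.1) + 0.6 * np.eye(4)

sparse_msg, sizes = forward_messages(unary, pairwise, eps=0.02)
dense_msg, _ = forward_messages(unary, pairwise, eps=0.0)
print(sizes, np.abs(sparse_msg - dense_msg).max())
```

In this toy setting the first node keeps two of its four states, so the message sum runs over half as many terms while the resulting message stays close to the dense one. Savings of the kind reported in the abstract would come from applying the same principle to much larger hypothesis sets, such as lexicon words.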