Video text recognition using sequential Monte Carlo and error voting methods

Authors:
Datong Chen;Jean-Marc Odobez
Affiliations:
IDIAP Research Institute, Rue du Simplon 4, 1920 Martigny, Valais, Switzerland;IDIAP Research Institute, Rue du Simplon 4, 1920 Martigny, Valais, Switzerland
Venue:
Pattern Recognition Letters
Year:
2005

Abstract

This paper addresses the segmentation and recognition of text embedded in video sequences, starting from the associated text image sequences extracted by a text detection module. To this end, we propose a probabilistic algorithm based on Bayesian adaptive thresholding and Monte Carlo sampling. The algorithm approximates the posterior distribution of the segmentation thresholds of text pixels in an image by a set of weighted samples. The set of samples is initialized by applying a classical segmentation algorithm to the first video frame and is then refined by random sampling under a temporal Bayesian framework. One important contribution of the paper is to show that, thanks to the proposed methodology, the likelihood of a segmentation parameter sample can be estimated not from a classification criterion or a visual quality criterion based on the produced segmentation map, but directly from the induced text recognition result, which is directly relevant to our task. As a second contribution, we propose to align the text recognition results from high-confidence samples gathered over time and to compose a final result at the character level using the Recognizer Output Voting Error Reduction (ROVER) technique. Experiments are conducted on a two-hour video database. Character recognition rates higher than 93% and word recognition rates higher than 90% are achieved, which are 4% and 3% higher, respectively, than those of state-of-the-art methods applied to the same database.
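
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The threshold-pair representation of a sample, the random-walk proposal, the `sigma` value, and the names `smc_step`, `vote_characters`, and `likelihood` are all assumptions made for illustration. `likelihood` stands in for "segment the frame with these thresholds, run OCR, and score the recognition result". The voting function is a much simplified stand-in for ROVER: it aligns each hypothesis to the most confident one and ignores insertions and deletions.

```python
import random
from collections import defaultdict
from difflib import SequenceMatcher


def smc_step(samples, weights, frame, likelihood, rng, sigma=4.0):
    """One sequential Monte Carlo update of weighted threshold samples.

    samples    : list of (low, high) threshold pairs (assumed representation)
    weights    : normalized weights, one per sample
    likelihood : hypothetical callable (frame, sample) -> non-negative score,
                 e.g. a confidence derived from the OCR output
    """
    n = len(samples)
    # Resample with replacement in proportion to the current weights.
    resampled = rng.choices(samples, weights=weights, k=n)
    # Random-walk proposal: perturb each threshold pair with Gaussian noise.
    proposed = [(lo + rng.gauss(0.0, sigma), hi + rng.gauss(0.0, sigma))
                for lo, hi in resampled]
    # Re-weight each proposed sample by its recognition-based likelihood.
    raw = [likelihood(frame, s) for s in proposed]
    total = sum(raw)
    if total <= 0.0:
        # Degenerate case: fall back to uniform weights.
        return proposed, [1.0 / n] * n
    return proposed, [w / total for w in raw]


def vote_characters(hypotheses):
    """Confidence-weighted character voting over recognized strings.

    hypotheses : list of (text, confidence) pairs
    Each text is aligned to the most confident text; only matched and
    equal-length substituted spans cast votes (a simplification of ROVER).
    """
    reference = max(hypotheses, key=lambda h: h[1])[0]
    votes = [defaultdict(float) for _ in reference]
    for text, conf in hypotheses:
        matcher = SequenceMatcher(None, reference, text, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            aligned = tag == "equal" or (
                tag == "replace" and (i2 - i1) == (j2 - j1))
            if aligned:
                for k in range(i2 - i1):
                    votes[i1 + k][text[j1 + k]] += conf
    # Pick the character with the largest accumulated confidence per position.
    return "".join(max(v, key=v.get) for v in votes)
```

In use, `smc_step` would be called once per frame, and the recognized strings of the high-confidence samples would be collected over time and passed to `vote_characters`. A full ROVER implementation builds a word or character transition network by dynamic-programming alignment and also handles insertions and deletions, which this sketch does not.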