Multi-script and multi-oriented text localization from scene images

Authors:
Thotreingam Kasar; Angarai G. Ramakrishnan
Affiliations:
Medical Intelligence and Language Engineering Laboratory, Department of Electrical Engineering, Indian Institute of Science, Bangalore, India (both authors)
Venue:
CBDAR'11 Proceedings of the 4th International Conference on Camera-Based Document Analysis and Recognition
Year:
2011

Abstract

This paper describes a new method for localizing color text in generic scene images that contain text of different scripts and arbitrary orientations. A representative set of colors is first identified from edge information and used to initialize an unsupervised clustering algorithm. Text components are then identified in each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from geometric, boundary, stroke and gradient information. Experiments on camera-captured images with variable fonts, sizes and colors, irregular layouts, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields a precision of 0.8 and a recall of 0.86 on a database of 100 images. The method is also compared with others in the literature on the ICDAR 2003 robust reading competition dataset.
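
The first stage described in the abstract, in which edge information supplies the initial colors for an unsupervised clustering step, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names neither the edge detector nor the clustering algorithm, so the code below assumes a simple neighbor-difference edge test and plain k-means as stand-ins. The function names (`edge_pixels`, `seed_colors`, `cluster_colors`) and the thresholds `thresh` and `min_dist` are hypothetical and chosen only for illustration.

```python
from math import dist


def edge_pixels(img, thresh=60.0):
    """Return (row, col) positions where the color jumps sharply to the right or below.

    Assumption: a neighbor-difference test stands in for a real edge detector.
    """
    h, w = len(img), len(img[0])
    found = []
    for y in range(h):
        for x in range(w):
            right = dist(img[y][x], img[y][x + 1]) if x + 1 < w else 0.0
            below = dist(img[y][x], img[y + 1][x]) if y + 1 < h else 0.0
            if max(right, below) > thresh:
                found.append((y, x))
    return found


def seed_colors(img, edges, min_dist=80.0):
    """Greedily keep edge-adjacent colors that are far from every seed chosen so far."""
    h, w = len(img), len(img[0])
    seeds = []
    for y, x in edges:
        # Sample both sides of the transition: the pixel and its right/lower neighbors.
        for ny, nx in ((y, x), (y, x + 1), (y + 1, x)):
            if ny < h and nx < w:
                c = img[ny][nx]
                if all(dist(c, s) >= min_dist for s in seeds):
                    seeds.append(tuple(float(v) for v in c))
    return seeds


def cluster_colors(img, seeds, iterations=10):
    """Plain k-means (Lloyd's algorithm) started from the given seeds.

    Assumption: k-means stands in for the unspecified clustering algorithm.
    Returns the cluster centers and a per-pixel label map (one color layer per label).
    """
    # With no edges there are no seeds; fall back to a single cluster.
    centers = list(seeds) or [tuple(float(v) for v in img[0][0])]
    pixels = [p for row in img for p in row]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in pixels]
        # Update step: each center moves to the mean color of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    w = len(img[0])
    layers = [labels[i:i + w] for i in range(0, len(labels), w)]
    return centers, layers


# Toy image: a dark-blue region next to a yellow region, as (R, G, B) tuples.
toy = [[(10, 10, 120)] * 3 + [(250, 240, 20)] * 3 for _ in range(4)]
toy_centers, toy_layers = cluster_colors(toy, seed_colors(toy, edge_pixels(toy)))
```

Seeding from colors found at edges means the number of clusters does not have to be fixed in advance; it follows from how many distinct colors meet at sharp transitions. In the full method each resulting color layer would then be passed to the component classifiers, which this sketch does not attempt to reproduce.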