Text Detection in Natural Scene Images by Stroke Gabor Words

Authors:
Chucai Yi;Yingli Tian
Venue:
ICDAR '11 Proceedings of the 2011 International Conference on Document Analysis and Recognition
Year:
2011

Citing 0
Cited 3

Camera-Based signage detection and recognition for blind persons
ICCHP'12 Proceedings of the 13th international conference on Computers Helping People with Special Needs - Volume Part II

Text extraction from scene images by character appearance and structure modeling
Computer Vision and Image Understanding

Text location in color images suitable for smartphone
Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service

Quantified Score

Hi-index	0.02

Abstract

In this paper, we propose a novel algorithm, based on stroke components and descriptive Gabor filters, to detect text regions in natural scene images. Text characters and strings are constructed from stroke components as basic units. Gabor filters are used to describe and analyze the stroke components in text characters or strings. We define a suitability measurement to analyze the confidence of a Gabor filter in describing a stroke component and the suitability of a Gabor filter for an image window. From the training set, we compute a set of Gabor filters whose parameters describe the principal stroke components of text. A K-means algorithm is then applied to cluster these descriptive Gabor filters. The cluster centers are defined as Stroke Gabor Words (SGWs), which provide a universal description of stroke components. By suitability evaluation on positive and negative training samples respectively, each SGW generates a pair of characteristic distributions of suitability measurements. On a test natural scene image, heuristic layout analysis is first applied to extract candidate image windows. We then compute the principal SGWs for each image window to describe its principal stroke components. The characteristic distributions generated by the principal SGWs are used to classify each window as text or non-text. Experimental results on benchmark datasets demonstrate that our algorithm can handle complex backgrounds and varied text patterns (font, color, scale, etc.).
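
The abstract does not give the suitability measurement, the filter parameterization, or the training procedure, so none of those are reproduced here. As a rough illustration of the two generic building blocks it names, the sketch below samples a textbook real-valued Gabor kernel and runs plain K-means over filter parameter tuples to obtain cluster centers. The function names, the `(wavelength, theta, sigma)` parameter layout, and the sample values are all assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Sample the real part of a standard 2-D Gabor function.

    `size` is assumed to be odd so the grid is centered on (0, 0).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope times a cosine carrier.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means over equal-length tuples of floats."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers


# Hypothetical filter parameters (wavelength, theta, sigma), made up for illustration.
filter_params = [
    (4.0, 0.0, 2.0),
    (4.2, 0.1, 2.1),
    (8.0, 1.5, 4.0),
    (8.2, 1.6, 4.1),
]

# The cluster centers play the role of the "words" in this toy setting.
words = sorted(kmeans(filter_params, k=2))
kernels = [gabor_kernel(9, w, t, s) for (w, t, s) in words]
```

Two simplifications are worth noting: the orientation `theta` is clustered with ordinary Euclidean distance, which ignores angular wrap-around, and the parameters are not rescaled, so the dimension with the largest numeric range dominates the distance.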