Localization, Extraction and Recognition of Text in Telugu Document Images

Authors:
Atul Negi;K. Nikhil Shanker;Chandra Kanth Chereddi
Venue:
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
Year:
2003

Abstract

In this paper we present a system to locate, extract and recognize Telugu text. The circular nature of Telugu script is exploited for segmenting text regions using the Hough Transform. First, the Hough Transform for circles is performed on the Sobel gradient magnitude of the image to locate text. The located circles are filled to yield text regions, followed by Recursive XY Cuts to segment the regions into paragraphs, lines and word regions. A region merging process with a bottom-up approach envelops individual words. Local binarization of the word MBRs yields connected components containing glyphs for recognition. The recognition process first identifies candidate characters by a zoning technique and then constructs structural feature vectors by cavity analysis. Finally, if required, crossing count based non-linear normalization and scaling are performed before template matching. The segmentation process succeeds in extracting text from images with complex non-Manhattan layouts. The recognition process gave a character recognition accuracy of 97%-98%.