A syntactic PR approach to Telugu handwritten character recognition

Authors:
Samit Kumar Pradhan;Atul Negi
Affiliations:
University of Hyderabad, Hyderabad, India;University of Hyderabad, Hyderabad, India
Venue:
Proceeding of the workshop on Document Analysis and Recognition
Year:
2012




Abstract

This paper presents a character recognition mechanism based on a syntactic pattern recognition (PR) approach that uses the trie data structure for efficient recognition and approximate string matching for classification. During preprocessing, an input character image is transformed into a skeletonized image, and discrete curves are found using a 3 x 3 pixel region. At the lower level, a trie that we call the sequence trie is used as a lookup structure to encode a discrete curve pattern of pixels: each sequence of discrete curves from the input pattern is looked up in the sequence trie. The encoding of several such sequence numbers for the thinned character forms a pattern string. Approximate string matching is then used to compare the pattern string encoded from a template character with the pattern string obtained from the input character. We use approximate rather than exact matching to make the approach robust in the presence of noise. A second trie, called the pattern trie, provides efficient storage and retrieval for approximate matching of the strings. We use the trie because a lookup takes O(m) time in the worst case, where m is the length of the longest string in the trie. For approximate string matching, we use look-ahead with a branch-and-bound scheme in the trie. For demonstration, we apply our method to 43 of the basic Telugu characters. The proposed approach recognised all of the test characters given here correctly; however, more extensive testing on realistic data is required.
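To make the role of the pattern trie concrete, the following is a minimal sketch of trie-based approximate string matching with a branch-and-bound cut-off. It is not the authors' implementation: the class and method names, the use of unit-cost edit distance, the single-character symbols standing in for sequence numbers, and the labels "ka" and "kha" are all assumptions chosen for illustration. The sketch also omits the look-ahead optimization mentioned in the abstract.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.label: Optional[str] = None  # set only where a stored template ends


class PatternTrie:
    """Hypothetical pattern trie: stores template pattern strings with labels."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, pattern: str, label: str) -> None:
        node = self.root
        for symbol in pattern:
            node = node.children.setdefault(symbol, TrieNode())
        node.label = label

    def lookup(self, pattern: str) -> Optional[str]:
        """Exact lookup: one step per symbol, so O(m) for a string of length m."""
        node = self.root
        for symbol in pattern:
            child = node.children.get(symbol)
            if child is None:
                return None
            node = child
        return node.label

    def nearest(self, query: str, max_cost: int) -> Tuple[Optional[str], Optional[int]]:
        """Return (label, cost) of the closest template within max_cost edits."""
        best_label: Optional[str] = None
        best_cost = max_cost + 1  # bound: only strictly cheaper matches are kept
        # Each stack entry holds a node and the edit-distance row between the
        # node's prefix and every prefix of the query.
        stack: List[Tuple[TrieNode, List[int]]] = [
            (self.root, list(range(len(query) + 1)))
        ]
        while stack:
            node, row = stack.pop()
            if min(row) >= best_cost:
                continue  # bound step: no extension of this prefix can win
            if node.label is not None and row[-1] < best_cost:
                best_label, best_cost = node.label, row[-1]
            for symbol, child in node.children.items():
                new_row = [row[0] + 1]
                for j in range(1, len(query) + 1):
                    substitution = row[j - 1] + (0 if query[j - 1] == symbol else 1)
                    new_row.append(min(new_row[j - 1] + 1, row[j] + 1, substitution))
                stack.append((child, new_row))  # branch step
        if best_label is None:
            return None, None
        return best_label, best_cost


# Illustrative usage with made-up pattern strings and labels.
trie = PatternTrie()
trie.insert("1234", "ka")
trie.insert("1254", "kha")
print(trie.lookup("1234"))      # exact match
print(trie.nearest("1334", 2))  # noisy input, one symbol away from "1234"
```

Because templates that share a prefix share trie nodes, the edit-distance row for that prefix is computed once, and the bound discards whole subtrees as soon as no string beneath them can beat the best match found so far.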