Integrating knowledge sources in Devanagari text recognition system

Authors:
V. Bansal; R. M. K. Sinha
Affiliations:
Dept. of Ind. & Manage. Eng., Indian Inst. of Technol., Kanpur;-
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
Year:
2000

Cited by (7):

Databases for Research on Recognition of Handwritten Characters of Indian Scripts
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition

Handwritten character recognition using elastic matching and PCA
Proceedings of the International Conference on Advances in Computing, Communication and Control

Recognition of off-line handwritten devnagari characters using quadratic classifier
ICVGIP'06 Proceedings of the 5th Indian conference on Computer Vision, Graphics and Image Processing

Detection of structural concavities in character images--a writer-independent approach
PerMIn'12 Proceedings of the First Indo-Japan conference on Perception and Machine Intelligence

Offline handwritten Gurmukhi character recognition: study of different feature-classifier combinations
Proceeding of the workshop on Document Analysis and Recognition

Development of comprehensive devnagari numeral and character database for offline handwritten character recognition
Applied Computational Intelligence and Soft Computing

Recognition of Bangla compound characters using structural decomposition
Pattern Recognition

Abstract

The reading process has been widely studied, and there is general agreement among researchers that knowledge in different forms and at different levels plays a vital role. This is the underlying philosophy of the Devanagari document recognition system described in this work. The knowledge sources we use are mostly statistical in nature or take the form of a word dictionary tailored specifically for optical character recognition (OCR). We do not perform any reasoning on these knowledge sources; however, we explore their relative importance and their role in the hierarchy. Some of the knowledge sources are acquired a priori by an automated training process, while others are extracted from the text as it is processed. A complete Devanagari OCR system has been designed and tested with real-life printed documents of varying size and font. Most of the documents used were photocopies of the originals. The system achieves approximately 90% correct recognition.
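
The abstract does not spell out how the word dictionary is combined with the character classifier, so the following is only a minimal sketch of one common way such a knowledge source can be applied, not the authors' method. It assumes the classifier returns several candidate symbols with confidences for each character position, and it picks the dictionary word that agrees best with those confidences. The function name `best_dictionary_word`, the `floor` parameter, the sample confidences, and the three-word dictionary are all hypothetical and chosen for illustration.

```python
import math


def best_dictionary_word(char_scores, dictionary, floor=1e-3):
    """Pick the dictionary word that best agrees with per-character scores.

    char_scores: one dict per character position, mapping a candidate
                 symbol to the classifier's confidence in (0, 1].
    dictionary:  iterable of valid words (a hypothetical word list).
    floor:       confidence assumed for symbols the classifier did not
                 propose at a position (an assumption of this sketch).
    """
    # Classifier-only reading: the most confident symbol at each position.
    top1 = "".join(max(scores, key=scores.get) for scores in char_scores)

    best_word, best_logp = None, -math.inf
    for word in dictionary:
        # Only words with one symbol per recognized position are comparable.
        if len(word) != len(char_scores):
            continue
        # Sum of log-confidences, i.e. the log of their product.
        logp = sum(
            math.log(scores.get(ch, floor))
            for ch, scores in zip(word, char_scores)
        )
        if logp > best_logp:
            best_word, best_logp = word, logp

    # Fall back to the classifier-only reading if no word has a matching length.
    return best_word if best_word is not None else top1


# Hypothetical classifier output for a three-symbol word.
scores = [
    {"क": 0.90, "फ": 0.10},
    {"भ": 0.55, "म": 0.45},  # the classifier slightly prefers the wrong symbol
    {"ल": 0.80, "त": 0.20},
]
words = ["कमल", "कलम", "नमक"]
print(best_dictionary_word(scores, words))
```

In this sketch the dictionary overrides a narrow classifier preference at the second position, because the classifier-only reading is not a dictionary word. The sample words contain only independent consonants so that each symbol is a single code point; real Devanagari text with vowel signs and conjuncts would need segmentation into recognition units before a comparison like this makes sense.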