An OCR System to Read Two Indian Language Scripts: Bangla and Devnagari (Hindi)

Authors:
B. B. Chaudhuri;U. Pal
Venue:
ICDAR '97 Proceedings of the 4th International Conference on Document Analysis and Recognition
Year:
1997

Cited 17

A Recognition System for Devnagri and English Handwritten Numerals
ICMI '00 Proceedings of the Third International Conference on Advances in Multimodal Interfaces

On OCR of Degraded Documents Using Fuzzy Multifactorial Analysis
AFSS '02 Proceedings of the 2002 AFSS International Conference on Fuzzy Systems. Calcutta: Advances in Soft Computing

A Bilingual OCR for Hindi-Telugu Documents and its Applications
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 1

Compression of scan-digitized Indian language printed text: a soft pattern matching technique
Proceedings of the 2003 ACM symposium on Document engineering

Adaptive Hindi OCR using generalized Hausdorff image comparison
ACM Transactions on Asian Language Information Processing (TALIP)

Zone Identification in the Printed Gujarati Text
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition

An Approach for Stemming in Symbolically Compressed Indian Language Imaged Documents
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition

Configurable Hybrid Architectures for Character Recognition Applications
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition

Fuzzy model based recognition of handwritten numerals
Pattern Recognition

Rotation, scale and translation invariant handwritten Devanagari numeral character recognition using general fuzzy neural network
Pattern Recognition

Optical character recognition for printed Hindi text in Devnagari using soft-computing technique
AIAP'07 Proceedings of the 25th IASTED International Multi-Conference: artificial intelligence and applications

Multilingual OCR system for South Indian scripts and English documents: An approach based on Fourier transform and principal component analysis
Engineering Applications of Artificial Intelligence

Fuzzy stroke analysis of Devnagari handwritten characters
WSEAS Transactions on Computers

Segmentation of touching characters in upper zone in printed Gurmukhi script
Proceedings of the 2nd Bangalore Annual Compute Conference

Digit extraction and recognition from machine printed Gurmukhi documents
Proceedings of the International Workshop on Multilingual OCR

Off-line handwritten word recognition using multi-stream hidden Markov models
Pattern Recognition Letters

On performance analysis of end-to-end OCR systems of Indic scripts
Proceeding of the workshop on Document Analysis and Recognition

Abstract

An OCR system is proposed that can read two Indian language scripts, Bangla and Devnagari (Hindi), the most popular scripts in the Indian subcontinent. Both scripts originate from the ancient Brahmi script and share many features, so a single system can be designed to recognize them. In the proposed model, document digitization, skew detection, text line segmentation and zone separation, word and character segmentation, and the grouping of characters into basic, modifier and compound categories are performed for both scripts by the same set of algorithms. The feature sets and classification trees, as well as the knowledge base required for error correction (such as the lexicon), differ for Bangla and Devnagari. The system performs well on single-font text printed on clear documents.
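
The abstract does not say which algorithms are used for text line segmentation and zone separation. As an illustration only, the sketch below uses a horizontal projection profile, a common technique for printed text that is not confirmed to be the authors' method. It assumes a binarized, deskewed page given as a list of rows with 1 for ink and 0 for background. The function names (`row_profile`, `segment_lines`, `headline_row`) are hypothetical. The densest row of a text line is taken as a guess for the head line that runs along the top of most characters in both scripts and can serve as the boundary between the upper and middle zones.

```python
from typing import List, Tuple

# Assumption: a binarized, deskewed page; 1 = ink pixel, 0 = background.
Image = List[List[int]]


def row_profile(img: Image) -> List[int]:
    """Count ink pixels in every row (horizontal projection profile)."""
    return [sum(row) for row in img]


def segment_lines(img: Image) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    A text line is a maximal run of rows that contain at least one ink pixel.
    """
    lines = []
    start = None
    for y, count in enumerate(row_profile(img)):
        if count > 0 and start is None:
            start = y                      # first inked row of a new line
        elif count == 0 and start is not None:
            lines.append((start, y))       # blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(img)))
    return lines


def headline_row(img: Image, top: int, bottom: int) -> int:
    """Guess the head line as the densest row inside one text line."""
    profile = row_profile(img[top:bottom])
    return top + profile.index(max(profile))
```

On real scans a strict zero threshold for blank rows is usually too brittle, so a small noise tolerance would be needed; the sketch omits it for brevity.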