Text-Image Separation in Devanagari Documents

Authors:
Swapnil Khedekar; Vemulapati Ramanaprasad; Srirangaraj Setlur; Venugopal Govindaraju
Venue:
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2
Year:
2003



Abstract

In this paper we present a top-down, projection-profile-based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devanagari text, called Shirorekha (Header Line), to analyze the pattern produced by Devanagari text in the horizontal profile. The horizontal profile corresponding to a text block possesses certain regularity in frequency and orientation and shows spatial cohesion. The algorithm uses these features to identify text blocks in a document image containing both text and graphics.
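
The idea of reading header-line regularity off the horizontal profile can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.6 peak threshold, the minimum of three lines, and the 20% spacing tolerance are all assumptions chosen for illustration, and only the "regularity in frequency" cue is modelled (orientation and spatial cohesion are left out). A block is treated as a binary image in which 1 marks ink.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def horizontal_profile(block: Image) -> List[int]:
    """Count the ink pixels in each row of the block."""
    return [sum(row) for row in block]


def header_line_rows(profile: List[int], frac: float = 0.6) -> List[float]:
    """Return the centre row of every run of rows whose ink count is at
    least `frac` of the maximum (candidate Shirorekha rows).
    The value of `frac` is an illustrative assumption."""
    peak = max(profile, default=0)
    if peak == 0:
        return []
    threshold = frac * peak
    centres: List[float] = []
    start = None
    # A trailing 0 acts as a sentinel that closes a run ending at the last row.
    for y, count in enumerate(profile + [0]):
        if count >= threshold and start is None:
            start = y
        elif count < threshold and start is not None:
            centres.append((start + y - 1) / 2)
            start = None
    return centres


def looks_like_text(block: Image, min_lines: int = 3,
                    tolerance: float = 0.2) -> bool:
    """Heuristic: a block is text-like if it has several header-line peaks
    whose spacing is nearly constant. Both thresholds are assumptions."""
    centres = header_line_rows(horizontal_profile(block))
    if len(centres) < min_lines:
        return False
    gaps = [b - a for a, b in zip(centres, centres[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return all(abs(g - mean_gap) <= tolerance * mean_gap for g in gaps)


# Synthetic data: four "text lines", each a full-width header row, five
# sparse body rows, and four blank rows; and a solid "picture" block.
header = [1] * 40
body = [1 if x % 4 == 0 else 0 for x in range(40)]
blank = [0] * 40
text_block = ([header] + [body] * 5 + [blank] * 4) * 4
picture_block = [[1] * 40 for _ in range(40)]
```

On the synthetic data, `looks_like_text(text_block)` finds four evenly spaced peaks and accepts the block, while `picture_block` yields a single wide run of dense rows and is rejected. Real scans would need binarization, skew handling, and thresholds tuned to the data.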