A knowledge-based system for extracting text-lines from mixed and overlapping text/graphics compound document images

Authors:
Yen-Lin Chen;Zeng-Wei Hong;Cheng-Hung Chuang
Affiliations:
Department of Computer Science and Information Engineering, National Taipei University of Technology, 1, Sec. 3, Chung-hsiao E. Rd., Taipei 10608, Taiwan;Department of Computer Science and Information Engineering, Asia University, 500 Liufeng Rd., Wufeng, Taichung 41354, Taiwan;Department of Computer Science and Information Engineering, Asia University, 500 Liufeng Rd., Wufeng, Taichung 41354, Taiwan
Venue:
Expert Systems with Applications: An International Journal
Year:
2012

Abstract

This paper presents a new knowledge-based system for extracting and identifying text-lines in real-life compound document images in which text and graphics are mixed or overlap. The system first decomposes the document image into distinct object planes, each holding homogeneous objects: textual regions of interest, non-text objects such as graphics and pictures, and background textures. A knowledge-based text extraction and identification method then obtains the text-lines, which may differ in their characteristics, from each plane. Because the method is rule-driven, the system can be adapted to new types of complex document images by adding or updating rules. Experimental and comparative results demonstrate the effectiveness of the system and its advantages in extracting text-lines with widely varying illumination levels, sizes, and font styles from mixed and overlapping text/graphics compound document images.
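The two stages named in the abstract (decomposition into planes, then rule-based text-line extraction per plane) can be illustrated with a small sketch. This is not the authors' method: the abstract does not say how planes are formed or which rules are used, so the fixed gray-level thresholds, the two geometric rules in `RULES`, and every function name below are assumptions chosen only to make the idea concrete.

```python
from collections import deque

def split_into_planes(image, thresholds):
    """Assign each pixel to one binary plane by gray level.
    Assumption: fixed thresholds stand in for the paper's decomposition step."""
    planes = [[[0] * len(row) for row in image] for _ in range(len(thresholds) + 1)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            index = sum(value >= t for t in thresholds)
            planes[index][y][x] = 1
    return planes

def connected_components(plane):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground regions."""
    height, width = len(plane), len(plane[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not plane[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < width and 0 <= ny < height \
                            and plane[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

# Hypothetical rule base: each rule is a predicate on a bounding box.
# Adapting the sketch to a new document type means editing this list only.
RULES = [
    lambda b: 2 <= (b[3] - b[1] + 1) <= 40,                 # character-like height
    lambda b: (b[2] - b[0] + 1) <= 3 * (b[3] - b[1] + 1),   # not a long ruling line
]

def text_like(box, rules=RULES):
    """A component is kept only if it satisfies every rule."""
    return all(rule(box) for rule in rules)

def group_into_lines(boxes):
    """Group boxes whose vertical extents overlap into candidate text-lines."""
    lines = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        for line in lines:
            if box[1] <= line["y1"] and box[3] >= line["y0"]:
                line["boxes"].append(box)
                line["y0"] = min(line["y0"], box[1])
                line["y1"] = max(line["y1"], box[3])
                break
        else:
            lines.append({"y0": box[1], "y1": box[3], "boxes": [box]})
    return [sorted(line["boxes"]) for line in lines]

def extract_text_lines(image, thresholds=(128,)):
    """Run both stages and return the candidate text-lines found in each plane."""
    result = []
    for plane in split_into_planes(image, thresholds):
        boxes = [b for b in connected_components(plane) if text_like(b)]
        result.append(group_into_lines(boxes))
    return result
```

Calling `extract_text_lines(image)` on a grayscale image given as a list of rows returns one list of text-lines per plane, each text-line being a left-to-right list of component bounding boxes. Keeping the rules in a plain list mirrors the abstract's point that the system is extended by updating rules; a realistic rule base would be much richer, and these two rules are too weak to reject large background regions.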