This paper presents a new knowledge-based system for extracting and identifying text-lines in real-life compound document images that mix text and graphics. The proposed system first decomposes the document image into distinct object planes to separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. A knowledge-based text extraction and identification method then obtains the text-lines with different characteristics in each plane. The system is flexible and extensible: it can be adapted to new types of real-life complex document images simply by adding or updating rules. Experimental and comparative results demonstrate the effectiveness of the proposed knowledge-based system and its advantages in extracting text-lines with a wide variety of illumination levels, sizes, and font styles from various types of complex compound document images with mixed and overlapping text and graphics.
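The two-stage idea described above (plane decomposition followed by rule-based filtering) can be illustrated with a toy sketch. This is not the paper's method: the abstract does not say how the planes are built or which rules are used, so the equal-width intensity bands, the 4-connected component labeling, and the single size/aspect rule below are all assumptions chosen for illustration, and every function name is hypothetical.

```python
from collections import deque


def split_into_planes(image, n_planes=3, max_level=256):
    """Assign each pixel to one of n_planes equal-width intensity bands.

    Assumption: fixed bands stand in for the (unspecified) decomposition step.
    """
    band = max_level / n_planes
    planes = [[[False] * len(row) for row in image] for _ in range(n_planes)]
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            k = min(int(value / band), n_planes - 1)
            planes[k][y][x] = True
    return planes


def connected_components(mask):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected True regions."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def looks_like_character(box, min_h=2, max_h=40, max_aspect=3.0):
    """Hypothetical rule: keep boxes with character-like height and aspect."""
    x0, y0, x1, y1 = box
    w, h = x1 - x0 + 1, y1 - y0 + 1
    return min_h <= h <= max_h and w / h <= max_aspect


# The rule base is a plain list, so new rules can be appended without
# touching the rest of the pipeline.
RULES = [looks_like_character]


def text_candidates(image, n_planes=3):
    """Map each plane index to the component boxes that satisfy every rule."""
    result = {}
    for k, plane in enumerate(split_into_planes(image, n_planes)):
        boxes = connected_components(plane)
        result[k] = [b for b in boxes if all(rule(b) for rule in RULES)]
    return result
```

Here `image` is a grayscale image given as a list of rows of integers in the range 0 to 255. Rules this crude will misclassify many regions (a large background area can pass the size test, for instance), so a usable rule base would need many more constraints, and grouping the surviving components into text-lines is left out entirely.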