An intelligent method to extract characters in color document with highlight regions

Authors:
Chun-Ming Tsai
Affiliations:
Department of Computer Science, Taipei Municipal University of Education, Taipei, Taiwan
Venue:
IEA/AIE'11 Proceedings of the 24th international conference on Industrial engineering and other applications of applied intelligent systems conference on Modern approaches in applied intelligence - Volume Part II
Year:
2011

Abstract

Most conventional character extraction methods consist of three steps: binarization (background determination), region segmentation, and region identification. Errors in binarization propagate to the segmentation and identification steps. This is a problem for color documents printed with regions of different background colors: the binarization fails to find effective thresholds, and the subsequent segmentation and identification steps do not work properly. In addition, conventional region segmentation methods are time-consuming on large document images, and conventional region identification methods work bottom-up on the preceding segmentation results. This study presents an intelligent method that addresses these problems by integrating background determination, region segmentation, and region identification to extract characters in color documents with highlight regions. The results demonstrate that the proposed method is more effective and efficient than other methods in terms of binarization results, extraction results, and computational performance.
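The binarization problem described in the abstract can be illustrated with a small sketch. This is not the paper's method, whose details the abstract does not give. It assumes Otsu's method as a stand-in for a conventional global threshold, and the synthetic gray levels (30 for text, 240 for plain background, 90 for a highlight background) and the helper names `otsu_threshold`, `count_foreground`, and `make_region` are made up for illustration. The per-region variant further assumes the highlight region is already known, which a real system would have to determine.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    n0 = 0  # number of pixels with value <= t
    s0 = 0  # sum of those pixel values
    for t in range(256):
        n0 += hist[t]
        s0 += t * hist[t]
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = s0 / n0
        mu1 = (total_sum - s0) / n1
        # Between-class variance up to a constant factor 1 / total**2.
        var = n0 * n1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def count_foreground(pixels, t):
    """Count pixels classified as text (dark): value <= t."""
    return sum(1 for p in pixels if p <= t)


def make_region(background, text=30, size=64, n_text=8):
    """Hypothetical region: n_text dark text pixels on a flat background."""
    return [text] * n_text + [background] * (size - n_text)


plain = make_region(background=240)     # text on a white-ish background
highlight = make_region(background=90)  # text on a dark highlight color
page = plain + highlight                # 16 true text pixels in total

# One global threshold for the whole page.
global_fg = count_foreground(page, otsu_threshold(page))

# One threshold per region (assumes the regions are already known).
local_fg = sum(count_foreground(r, otsu_threshold(r)) for r in (plain, highlight))

print(global_fg, local_fg)  # prints: 72 16
```

In this toy page the global threshold falls between the highlight background and the white background, so all 56 highlight background pixels are labeled as text along with the 16 real text pixels. Thresholding each region separately recovers exactly the 16 text pixels, which suggests why background determination and region segmentation may need to be handled together.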