Text Region Extraction and Text Segmentation on Camera-Captured Document Style Images

Authors:
Y. J. Song;K. C. Kim;Y. W. Choi;H. R. Byun;S. H. Kim;S. Y. Chi;D. K. Jang;Y. K. Chung
Affiliations:
Sookmyung Women's University, Korea;Yonsei University, Korea;Sookmyung Women's University, Korea;Yonsei University, Korea;University of Denver, USA;Visual Recognition Research Team, Electronics and Telecommunications Research Institute, Korea;Visual Recognition Research Team, Electronics and Telecommunications Research Institute, Korea;Visual Recognition Research Team, Electronics and Telecommunications Research Institute, Korea
Venue:
ICDAR '05 Proceedings of the Eighth International Conference on Document Analysis and Recognition
Year:
2005

Abstract

In this paper, we propose a method for extracting text from camera-captured document style images and a text segmentation method based on color clustering. The extraction method detects candidate text regions using two low-level image features, intensity variation and color variance, and verifies those regions with a high-level text stroke feature; the two feature levels are combined hierarchically. The stroke feature is computed with multi-resolution wavelet transforms on local image areas, and the resulting feature vector is passed to an SVM (Support Vector Machine) for verification when needed. The segmentation method applies color clustering to the extracted text regions, using an improved K-means clustering method that selects K and the initial seed values automatically. We tested the proposed methods on various document style images captured by three different cameras and confirmed that the extraction rates are good enough to be used in real-life applications.
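
The abstract does not say how K and the seed values are chosen, so the following is only an illustrative sketch of one plausible approach, not the authors' algorithm. It assumes that each dominant color in a text region falls into its own coarse RGB bin: every bin holding at least a given share of the pixels contributes one seed (its mean color), K is the number of such bins, and standard K-means then refines the centers. The function names and the `step` and `min_share` parameters are hypothetical.

```python
def mean_color(points):
    """Component-wise mean of a non-empty list of (r, g, b) tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def auto_seeds(pixels, step=64, min_share=0.1):
    """Choose seeds from coarse RGB bins (an assumed heuristic).

    Each bin that holds at least `min_share` of the pixels yields one
    seed, so K is simply the number of seeds returned. `pixels` must be
    a non-empty list of (r, g, b) tuples with values in 0..255.
    """
    bins = {}
    for p in pixels:
        key = tuple(c // step for c in p)
        bins.setdefault(key, []).append(p)
    seeds = [mean_color(bins[key]) for key in sorted(bins)
             if len(bins[key]) >= min_share * len(pixels)]
    if not seeds:
        # No bin is dominant: fall back to the single most populated bin.
        seeds = [mean_color(max(bins.values(), key=len))]
    return seeds


def kmeans(pixels, seeds, max_iter=20):
    """Standard K-means on colors, starting from the given seeds."""
    centers = list(seeds)
    labels = []
    for _ in range(max_iter):
        # Assign every pixel to its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda k: sq_dist(p, centers[k]))
                  for p in pixels]
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            new_centers.append(mean_color(members) if members else centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels
```

For a region with dark text on a light background, `auto_seeds` would typically return two seeds, and the labels from `kmeans(pixels, auto_seeds(pixels))` then separate text pixels from background pixels. Small amounts of noise fall below `min_share` and do not create extra clusters.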