Skew detection for complex document images using robust borderlines in both text and non-text regions

Authors:
Hong Liu;Qi Wu;Hongbin Zha;Xueping Liu
Affiliations:
National Laboratory on Machine Perception, Peking University, Beijing 100871, China and Shenzhen Graduate School, Peking University, Beijing 100871, China;National Laboratory on Machine Perception, Peking University, Beijing 100871, China;National Laboratory on Machine Perception, Peking University, Beijing 100871, China;Ricoh Co., Japan
Venue:
Pattern Recognition Letters
Year:
2008

Abstract

This paper proposes a skew detection method for complex document images based on robust borderlines extracted from both text and non-text regions. First, borderlines are extracted from the borders of large connected components in the document image using a run-length-based method. Second, after non-linear borderlines are filtered out, a fast iterative algorithm optimizes the directional angle of each remaining linear borderline. Finally, the weighted median of all the directional angles is taken as the skew angle of the whole document. In experiments on 2000 document images with various skews, the overall correct rate is 95.2%, and the average detection time is less than 0.2 s per document. The method handles complex documents with horizontal and vertical text layouts and with text in English, Japanese, and Chinese, and it works especially well for documents with predominant non-text regions or sparse text regions.
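
The final step described above, taking a weighted median of the borderline angles, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how the weights are defined, so the sketch assumes each borderline is weighted by its length in pixels. The function name `weighted_median` and the sample angles and lengths are hypothetical.

```python
from typing import Sequence


def weighted_median(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the lower weighted median of `values`.

    This is the smallest value at which the cumulative weight, taken in
    ascending order of value, reaches half of the total weight.
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    pairs = sorted(zip(values, weights))  # ascending by value
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= half:
            return value
    return pairs[-1][0]  # fallback; not reached for valid input


# Hypothetical borderline angles (degrees) and lengths (pixels).
# Assumption: length is used as the weight. One short outlier is included.
angles = [1.9, 2.1, 2.0, -15.0, 2.2]
lengths = [400, 350, 500, 60, 300]

skew = weighted_median(angles, lengths)
print(skew)  # 2.0
```

In this example the short borderline at -15.0 degrees carries little weight and does not move the estimate, which is the usual reason to prefer a weighted median over a weighted mean when some borderlines are unreliable.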