Skew estimation and page segmentation are two closely related processing stages in document image analysis. Skew estimation requires proper page segmentation, especially for document images with multiple skews, which are common in 2-up scans of thick bound publications and in postal envelopes carrying various printed labels. Even when a document image has only a single skew of interest, minority regions with different skews or with undefined skew, such as noise, may severely affect the estimate of the dominant skew. Page segmentation, on the other hand, may need to know the exact skew angle of a page in order to work properly. This paper presents a skew estimation method with built-in, skew-independent segmentation that is capable of handling document images containing multiple regions of different skews. It is based on the convex hulls of the individual components (i.e. the smallest convex polygon that fully contains a component) and those of the component groups (i.e. the smallest convex polygon that fully contains all the components in a group) in a document image. The proposed method first extracts the convex hulls of the components and then segments the image into groups of components according to both the spatial distances and the size similarities among those convex hulls. This process not only extracts hints about the alignment of the text groups but also separates noise and graphical components from textual ones. The proposed algorithms are verified on the full sets of real and synthetic samples of the University of Washington English Document Image Database I (UW-I). Quantitative and qualitative comparisons with some existing methods are also provided.
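The hull extraction and grouping steps described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the grouping criteria in detail, so the centroid-distance test, the area-ratio test, the thresholds `max_dist` and `max_size_ratio`, and all function names are assumptions made for illustration. How the skew angle is then derived from the hulls is not stated in the abstract and is not shown.

```python
import math
from itertools import combinations


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(hull):
    """Polygon area by the shoelace formula (0 for degenerate hulls)."""
    n = len(hull)
    if n < 3:
        return 0.0
    s = sum(hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
            for i in range(n))
    return abs(s) / 2.0


def centroid(hull):
    """Mean of the hull vertices (a simplification of the true area centroid)."""
    return (sum(x for x, _ in hull) / len(hull),
            sum(y for _, y in hull) / len(hull))


def group_components(hulls, max_dist, max_size_ratio):
    """Assumed criteria: link two components when their centroids are within
    max_dist and their hull areas differ by at most max_size_ratio; groups are
    the connected sets of linked components (union-find)."""
    parent = list(range(len(hulls)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(hulls)), 2):
        (xi, yi), (xj, yj) = centroid(hulls[i]), centroid(hulls[j])
        ai, aj = hull_area(hulls[i]), hull_area(hulls[j])
        similar = min(ai, aj) > 0 and max(ai, aj) / min(ai, aj) <= max_size_ratio
        if similar and math.hypot(xi - xj, yi - yj) <= max_dist:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(hulls)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def group_hull(hulls, group):
    """Smallest convex polygon containing every component hull in the group."""
    return convex_hull([p for i in group for p in hulls[i]])
```

Here each component is assumed to be given as a list of `(x, y)` pixel coordinates, for example from a connected-component labelling step. Calling `convex_hull` on every component, then `group_components` on the resulting hulls, yields index lists for which `group_hull` returns the enclosing polygon; a component much larger than its neighbours, such as a graphic, ends up in its own group under the size-ratio test.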