This paper describes a method for extracting words, textlines and text blocks by analyzing the spatial configuration of the bounding boxes of connected components in a given document image. The basic idea is that connected components of black pixels can serve as the computational units of document image analysis. The problem of extracting words, textlines and text blocks is viewed as a clustering problem in the 2-dimensional discrete domain. Our main strategy is to use profiling analysis to measure the horizontal or vertical gaps between components, or groups of components, during image segmentation. For this purpose, we compute for each connected component its bounding box, the smallest rectangle that circumscribes it. These boxes are projected horizontally and/or vertically, and the local and global projection profiles are analyzed to segment words, textlines and text blocks. In the last step of segmentation, the document decomposition hierarchy is produced from the segmented objects.
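The projection step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the half-open box convention `(x0, y0, x1, y1)`, and the fixed gap thresholds `line_gap` and `word_gap` are assumptions made for illustration, and the abstract does not say how gap thresholds are chosen. The sketch projects bounding boxes onto one axis, cuts the profile wherever a run of empty bins is long enough, and applies this first vertically (textlines) and then horizontally within each line (words). Text-block grouping is omitted.

```python
from typing import List, Tuple

# Hypothetical convention: (x0, y0, x1, y1), half-open, with x0 < x1 and y0 < y1.
Box = Tuple[int, int, int, int]


def projection_profile(boxes: List[Box], axis: str, length: int) -> List[int]:
    """Count how many bounding boxes cover each coordinate along one axis."""
    profile = [0] * length
    for box in boxes:
        lo, hi = (box[0], box[2]) if axis == "x" else (box[1], box[3])
        for i in range(lo, hi):
            profile[i] += 1
    return profile


def split_on_gaps(boxes: List[Box], axis: str, length: int,
                  min_gap: int) -> List[List[Box]]:
    """Cluster boxes separated by at least `min_gap` empty profile bins."""
    profile = projection_profile(boxes, axis, length)
    intervals = []          # occupied intervals, small gaps merged
    start, end, gap = None, 0, 0
    for i, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = i
            end = i + 1
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:      # gap is wide enough: close the cluster
                intervals.append((start, end))
                start = None
    if start is not None:
        intervals.append((start, end))
    key = 0 if axis == "x" else 1   # assign each box by its leading edge
    return [[b for b in boxes if lo <= b[key] < hi] for lo, hi in intervals]


def segment(boxes: List[Box], width: int, height: int,
            line_gap: int = 2, word_gap: int = 3) -> List[List[List[Box]]]:
    """Return textlines, each a list of words, each a list of component boxes."""
    lines = split_on_gaps(boxes, "y", height, line_gap)
    return [split_on_gaps(line, "x", width, word_gap) for line in lines]


# Toy page: two textlines; the first holds two words of two components each.
boxes = [(0, 0, 2, 5), (3, 0, 5, 5), (10, 0, 12, 5), (13, 1, 15, 5),
         (0, 8, 2, 12), (6, 8, 8, 12)]
hierarchy = segment(boxes, width=16, height=12)
```

Here `hierarchy` is a nested list (lines, then words, then boxes), a simple stand-in for the decomposition hierarchy mentioned in the abstract. In practice the gap thresholds would have to be derived from the page itself, for example from the distribution of gaps in the profiles, rather than fixed by hand as in this sketch.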