Segmenting a document image into text and non-text regions is an important preprocessing step for a variety of document image analysis tasks, such as improving OCR accuracy and document compression. Most state-of-the-art document image segmentation approaches rely on pixel-based or zone-based (block-based) classification. Pixel-based classification is time-consuming, whereas block-based methods depend heavily on the accuracy of the block segmentation step. In contrast, our approach classifies connected components and therefore requires no prior block segmentation. We train a self-tunable multi-layer perceptron (MLP) classifier to distinguish text from non-text connected components, using shape and context information as the feature vector. We evaluated the method on a subset of UW-III, on the ICDAR 2009 page segmentation competition test images, and on a circuit diagram dataset, and compared its results with those of the state-of-the-art page segmentation algorithm in Leptonica. The experimental results demonstrate the effectiveness of the proposed algorithm.
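As a rough illustration of the connected-component idea, the sketch below labels 8-connected foreground components in a small binary image and builds one feature vector per component. The specific features (bounding-box height and width, aspect ratio, pixel density, and a count of nearby components as a crude context cue), the `radius` parameter, and all function names are assumptions made for this example; the abstract does not specify the features or the MLP configuration actually used. Vectors of this kind would be the input to a text/non-text classifier such as an MLP.

```python
from collections import deque


def connected_components(img):
    """Return the 8-connected foreground (value 1) components as pixel lists."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(pixels)
    return comps


def bounding_box(pixels):
    """Return (top, left, bottom, right), inclusive, for a pixel list."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def extract_features(img, radius=5):
    """Build one illustrative shape-plus-context vector per component.

    Each vector is [height, width, aspect ratio, density, neighbour count].
    These features are assumptions for this sketch, not the paper's.
    """
    comps = connected_components(img)
    boxes = [bounding_box(c) for c in comps]
    centres = [((t + b) / 2, (l + r) / 2) for t, l, b, r in boxes]
    features = []
    for i, (top, left, bottom, right) in enumerate(boxes):
        height, width = bottom - top + 1, right - left + 1
        aspect = width / height
        density = len(comps[i]) / (height * width)
        # Context: other components whose box centre lies within `radius`
        # (Chebyshev distance) of this component's box centre.
        neighbours = sum(
            1 for j, (cy, cx) in enumerate(centres)
            if j != i and max(abs(cy - centres[i][0]),
                              abs(cx - centres[i][1])) <= radius
        )
        features.append([height, width, aspect, density, neighbours])
    return features


if __name__ == "__main__":
    page = [
        [1, 1, 0, 0, 0, 0],
        [1, 1, 0, 1, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 0],
    ]
    for vec in extract_features(page):
        print(vec)
```

Because every component is described independently, no block segmentation is needed before classification; the per-component labels can afterwards be grouped into text and non-text regions.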