Text line segmentation is a basic step in any OCR system, and its failure degrades the performance of OCR engines. This is especially true for Indian languages because of the nature of their scripts. Many segmentation algorithms have been proposed in the literature. Often these algorithms fail to adapt dynamically to a given page and thus tend to yield poor segmentation for specific regions or specific pages. In this work we design a text line segmentation post-processor that automatically localizes and corrects segmentation errors. The proposed post-processor, which works in a "learning by examples" framework, is not only independent of the segmentation algorithm but also robust to the diversity of scanned pages. We show an improvement of over 5% in text line segmentation on a large dataset of scanned pages in multiple Indian languages.