Page segmentation and classification utilising a bottom-up approach
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 2) - Volume 2
Recursive X-Y cut using bounding boxes of connected components
ICDAR '95 Proceedings of the Third International Conference on Document Analysis and Recognition (Volume 2) - Volume 2
This paper presents a speed-up method for document page segmentation, one of the most important processes in an Optical Character Recognition (OCR) system. The proposed scheme scans the page with a 12×12-pixel window to find black pixels and their contour borders. From these character borders it builds a reduced image, called the optimum image, in which each 12×12-pixel block of the original picture is represented by a single pixel. The optimum image therefore contains only 1/144 as many pixels as the original while still preserving the structure of the original page. Finally, block extraction is performed on the optimum image instead of the original, which makes that step faster. Experimental results show that the proposed scheme significantly reduces the processing time of document page segmentation.
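The 12×12 reduction step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how contour borders are traced, so this version assumes a simpler rule in which a block maps to a black pixel whenever it contains any black pixel. The function name `reduce_image` and the 0/1 list-of-lists image representation are hypothetical choices made for illustration.

```python
from typing import List

# Assumed representation: a binary image as rows of ints, 1 = black (ink), 0 = white.
Image = List[List[int]]


def reduce_image(image: Image, window: int = 12) -> Image:
    """Represent each window x window block of `image` by a single pixel.

    Assumption (not from the paper): a block becomes 1 if it contains at
    least one black pixel, otherwise 0. Blocks at the right and bottom
    edges may be smaller than `window` and are handled the same way.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    reduced: Image = []
    for top in range(0, height, window):
        row = []
        for left in range(0, width, window):
            block_has_ink = any(
                image[y][x]
                for y in range(top, min(top + window, height))
                for x in range(left, min(left + window, width))
            )
            row.append(1 if block_has_ink else 0)
        reduced.append(row)
    return reduced


# Usage: a 24 x 36 page with a single black pixel at row 13, column 25.
page = [[0] * 36 for _ in range(24)]
page[13][25] = 1
small = reduce_image(page)

original_pixels = len(page) * len(page[0])
reduced_pixels = len(small) * len(small[0])
```

When the page dimensions are multiples of 12, the reduced image has exactly 1/144 as many pixels as the original, so any later block extraction pass visits far fewer pixels.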