This paper is concerned with research on optical character recognition (OCR) of printed mathematical expressions. The construction of a representative corpus of technical and scientific documents containing expressions is discussed. A statistical investigation of the corpus is presented, and the usefulness of this analysis is demonstrated for four related research problems: (i) identification and segmentation of expression zones from the rest of the document, (ii) recognition of expression symbols, (iii) interpretation of expression structures, and (iv) performance evaluation of a mathematical expression recognition system. Moreover, a groundtruthing format is proposed to facilitate automatic evaluation of expression recognition techniques.