This paper presents an XML-based scheme for managing a large multilingual OCR project. In particular, we describe how a new XML-based tagging scheme has been exploited to achieve the objectives of the project. Managing a large multilingual OCR project in which multiple research groups collaboratively develop script-specific and script-independent technologies is a challenging problem. We present some of the software and data management strategies designed for the project, which aimed to develop OCR for 11 scripts of Indian origin for which mature OCR technology was not available.