Multilingual OCR research and applications: an overview

  • Authors:
  • Xujun Peng; Huaigu Cao; Srirangaraj Setlur; Venu Govindaraju; Prem Natarajan

  • Affiliations:
  • Raytheon BBN Technologies, Cambridge, MA; Raytheon BBN Technologies, Cambridge, MA; CUBS, University at Buffalo, Buffalo, NY; CUBS, University at Buffalo, Buffalo, NY; Univ. of Southern California, Marina del Rey, CA

  • Venue:
  • Proceedings of the 4th International Workshop on Multilingual OCR
  • Year:
  • 2013

Abstract

This paper offers an overview of current approaches to research in off-line multilingual OCR. Typically, off-line OCR systems are designed for a particular script or language. However, the ideal approach to multilingual OCR would likely be to develop a system that can, with the use of language-specific training data, be re-targeted to process different languages with minimal modifications. This remains an open area of research with many challenges, particularly for multilingual handwriting recognition, where writing styles vary even within the same script. The paper presents the challenges for multilingual OCR in preprocessing, feature extraction, script identification, and recognition modeling, together with a brief survey of research in these areas, and outlines ideas for future research.