Can we build language-independent OCR using LSTM networks?

  • Authors:
  • Adnan Ul-Hasan and Thomas M. Breuel

  • Affiliation:
  • Technical University of Kaiserslautern, Kaiserslautern, Germany (both authors)

  • Venue:
  • Proceedings of the 4th International Workshop on Multilingual OCR
  • Year:
  • 2013

Abstract

Language models or recognition dictionaries are usually considered an essential component of OCR. However, using a language model complicates the training of OCR systems and narrows the range of texts an OCR system can handle. Recent results have shown that Long Short-Term Memory (LSTM) based OCR yields low error rates even without language modeling. In this paper, we explore to what extent LSTM models can be used for multilingual OCR without language models. To do this, we measure the cross-language performance of LSTM models trained on different languages. LSTM models show good promise for language-independent OCR: recognition error rates are very low (around 1%) without any language model or dictionary correction.
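An LSTM-based OCR system of this kind typically emits per-timestep character probabilities and decodes them directly, with no dictionary lookup or language-model rescoring. The sketch below illustrates greedy (best-path) CTC-style decoding, the usual decoding step for such systems; the alphabet and probability values are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Hypothetical alphabet; index 0 is the CTC blank symbol.
ALPHABET = ["<blank>", "a", "b", "c"]

def greedy_ctc_decode(probs, alphabet=ALPHABET, blank=0):
    """Best-path CTC decoding: take the argmax character at each
    timestep, collapse consecutive repeats, then drop blanks.
    No dictionary or language model is consulted."""
    best = np.argmax(probs, axis=1)  # per-timestep argmax indices
    out = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(alphabet[idx])
        prev = idx
    return "".join(out)

# Toy per-timestep probabilities over the alphabet (6 timesteps),
# standing in for the softmax outputs of a trained LSTM.
probs = np.array([
    [0.10, 0.80, 0.05, 0.05],  # 'a'
    [0.10, 0.80, 0.05, 0.05],  # 'a' repeated -> collapsed
    [0.90, 0.03, 0.03, 0.04],  # blank separates a true double letter
    [0.10, 0.70, 0.10, 0.10],  # 'a' again -> second 'a'
    [0.05, 0.05, 0.85, 0.05],  # 'b'
    [0.05, 0.05, 0.05, 0.85],  # 'c'
])
print(greedy_ctc_decode(probs))  # -> "aabc"
```

Because decoding depends only on the network's outputs and a character alphabet, the same decoder works unchanged across languages, which is what makes the language-independence question in the paper well-posed.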