NEOCR: a configurable dataset for natural image text recognition

  • Authors:
  • Robert Nagy; Anders Dicker; Klaus Meyer-Wegener

  • Affiliations:
  • Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Erlangen, Germany; Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Erlangen, Germany; Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Erlangen, Germany

  • Venue:
  • CBDAR'11 Proceedings of the 4th international conference on Camera-Based Document Analysis and Recognition
  • Year:
  • 2011

Abstract

Recently, growing attention has been paid to recognizing text in natural images. Natural image text OCR is far more complex than OCR in scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural image OCR. We propose a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images, intended for the evaluation and comparison of approaches to natural image text OCR. The rich annotations of the proposed NEOCR dataset make new and more precise evaluations possible, giving more detailed information on where improvements are most needed in natural image text OCR.