NEOCR: a configurable dataset for natural image text recognition

Authors:
Robert Nagy;Anders Dicker;Klaus Meyer-Wegener
Affiliations:
Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Erlangen, Germany (all authors)
Venue:
CBDAR'11 Proceedings of the 4th international conference on Camera-Based Document Analysis and Recognition
Year:
2011

Abstract

Recently, growing attention has been paid to recognizing text in natural images. Natural image text OCR is far more complex than OCR in scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural image OCR. We propose a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images, intended for the evaluation and comparison of approaches to natural image text OCR. The rich annotations of the proposed NEOCR dataset make new and more precise evaluations possible, which give more detailed information on where improvements are most needed in natural image text OCR.