Text detection on charts and graphs

Authors:
N. Vassilieva;Yu. Gladysheva
Affiliations:
HP Labs, St. Petersburg, Russia 191194;Saint Petersburg State University, Peterhof, St. Petersburg, Russia 198504
Venue:
Pattern Recognition and Image Analysis
Year:
2011

Abstract

Current optical character recognition (OCR) systems cannot detect and recognize isolated words in an image, especially when the text is not horizontal. Such text blocks are typical of charts and graphs. This paper proposes an algorithm for detecting small text blocks of arbitrary orientation, color, style, and font size; it can be used to localize text before an arbitrary character recognition system is applied. According to the experimental results, using the proposed algorithm to determine the location and orientation of text blocks on charts and graphs, and passing this information to a text recognition system, increases recall by a factor of 20 and text recognition precision by a factor of 15. The experiments were carried out on a test collection of 1000 charts containing about 14,000 text blocks, which was created with the XML/SWF Chart tool.
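
The abstract reports its gains in terms of fullness (recall) and precision of the recognized text. As an illustration only, the sketch below shows one way such scores could be computed for a single chart. The function name `precision_recall`, the exact-string matching rule, and the sample strings are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def precision_recall(recognized, ground_truth):
    """Return (precision, recall) for two lists of text-block strings.

    Blocks are matched by exact string equality, counted as a multiset.
    This matching rule is an assumption, not the paper's protocol.
    """
    rec = Counter(recognized)
    truth = Counter(ground_truth)
    # Multiset intersection: each true block can be matched at most once.
    matched = sum((rec & truth).values())
    precision = matched / len(recognized) if recognized else 0.0
    recall = matched / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical text blocks on one chart (invented data).
truth = ["Sales", "2010", "2011", "Revenue"]
# Hypothetical OCR output without prior text localization.
baseline = ["Sa1es", "2010"]
# Hypothetical OCR output when block location and orientation are supplied.
with_detection = ["Sales", "2010", "2011", "Revenue", "x"]

print(precision_recall(baseline, truth))
print(precision_recall(with_detection, truth))
```

Precision here is the share of recognized blocks that are correct, and recall is the share of true blocks that were recovered; a real evaluation would likely need a more tolerant matching rule than exact equality.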