A comprehensive neural-based approach for text recognition in videos using natural language processing

  • Authors:
  • Khaoula Elagouni;Christophe Garcia;Pascale Sébillot

  • Affiliations:
  • Orange Labs R&D, rue du Clos Courtel, Cesson-Sévigné Cedex, France;LIRIS, Insa de Lyon, Bât. Jules Verne, Villeurbanne Cedex, France;IRISA, Insa de Rennes, Campus de Beaulieu, Rennes Cedex, France

  • Venue:
  • Proceedings of the 1st ACM International Conference on Multimedia Retrieval
  • Year:
  • 2011

Abstract

This work aims to support multimedia content understanding by exploiting textual clues embedded in digital videos. To this end, we developed a complete video Optical Character Recognition (OCR) system, specifically adapted to detect and recognize text embedded in videos. Based on a neural approach, this new method outperforms related work, especially in terms of robustness to variations in style and size, to background complexity, and to low image resolution. A language model that drives several steps of the video OCR is also introduced in order to remove ambiguities caused by local letter-by-letter recognition and to reduce segmentation errors. The approach has been evaluated on a database of French TV news videos and achieves an outstanding character recognition rate of 95%, corresponding to 78% of words correctly recognized, which enables its incorporation into an automatic video indexing and retrieval system.
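
To illustrate how a language model can remove ambiguities left by letter-by-letter recognition, here is a minimal sketch, not the authors' implementation. It assumes a hypothetical character classifier that outputs a probability for each candidate letter at each position, and a toy bigram table standing in for a real language model. All names (`greedy_decode`, `viterbi_decode`, `BIGRAMS`, `lm_weight`) and all numbers are invented for illustration.

```python
import math

# Toy bigram probabilities P(cur | prev); invented values, not from the paper.
BIGRAMS = {("d", "e"): 0.5, ("o", "e"): 0.05}
DEFAULT_BIGRAM = 0.01  # assumed back-off probability for unseen letter pairs


def bigram_logp(prev, cur):
    """Log-probability of `cur` following `prev` under the toy bigram table."""
    return math.log(BIGRAMS.get((prev, cur), DEFAULT_BIGRAM))


def greedy_decode(char_probs):
    """Purely local decoding: pick the most probable letter at each position."""
    return "".join(max(p, key=p.get) for p in char_probs)


def viterbi_decode(char_probs, lm_weight=1.0):
    """Pick the letter sequence maximizing classifier plus language-model score.

    char_probs: list of dicts {letter: probability}, one per character position.
    Probabilities are assumed to be strictly positive.
    """
    # best[c] = (score, path) of the best sequence so far that ends in letter c
    best = {c: (math.log(p), [c]) for c, p in char_probs[0].items()}
    for probs in char_probs[1:]:
        new_best = {}
        for c, p in probs.items():
            # Choose the best predecessor for letter c.
            score, path = max(
                ((s + lm_weight * bigram_logp(prev, c), pth)
                 for prev, (s, pth) in best.items()),
                key=lambda t: t[0],
            )
            new_best[c] = (score + math.log(p), path + [c])
        best = new_best
    return "".join(max(best.values(), key=lambda t: t[0])[1])


# Hypothetical classifier output for a two-letter word: 'o' narrowly beats 'd'.
char_probs = [{"d": 0.45, "o": 0.55}, {"e": 0.9, "c": 0.1}]
local_result = greedy_decode(char_probs)
lm_result = viterbi_decode(char_probs)
```

In this toy case the local decision favours "oe", while the combined score favours "de" because the bigram table makes "d" followed by "e" far more likely. The weight `lm_weight` is an assumed tuning knob that balances the classifier against the language model.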