Video OCR: indexing digital news libraries by recognition of superimposed captions

Authors:
Toshio Sato;Takeo Kanade;Ellen K. Hughes;Michael A. Smith;Shin'ichi Satoh
Affiliations:
Toshiba Corporation, 70 Yanagi-cho, Saiwai-ku, Kawasaki 210-8501, Japan;School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA;School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA;School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA;National Center for Science Information Systems (NACSIS), 3-29-1 Otsuka, Bunkyo-ku, Tokyo, 112-8640, Japan
Venue:
Multimedia Systems - Special section on video libraries
Year:
1999

Abstract

The automatic extraction and recognition of news captions and annotations can be of great help in locating topics of interest in digital news video libraries. To achieve this goal, we present a technique, called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we address problems, describe the method by which Video OCR operates, and suggest applications for its use in digital news archives. To solve two problems of character recognition in video, namely low-resolution characters and extremely complex backgrounds, we apply an interpolation filter, multiframe integration, and character extraction filters. Character segmentation is performed by a recognition-based segmentation method, and intermediate character recognition results are used to improve the segmentation. We also include a method for locating text areas using text-like properties and a language-based postprocessing technique to increase word recognition rates. The overall recognition results are satisfactory for use in news indexing. Performing Video OCR on news video and combining its results with other video understanding techniques will improve the overall understanding of the news video content.
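
The abstract names multiframe integration as one remedy for complex backgrounds but does not spell out the operator. The following is a minimal sketch of one plausible form, assuming captions are bright and stay fixed across consecutive, already aligned frames while the background changes: taking the per-pixel minimum over time keeps the caption bright and darkens the background. The function name `integrate_frames`, the list-of-lists frame representation, and the toy data are all hypothetical and chosen for illustration only.

```python
from typing import List

# A grayscale frame as rows of 0-255 intensities (illustrative representation).
Frame = List[List[int]]


def integrate_frames(frames: List[Frame]) -> Frame:
    """Combine aligned frames by taking the per-pixel minimum intensity.

    Assumption (not stated in the abstract): caption pixels are bright and
    static, so they survive the minimum, while moving background pixels are
    pulled toward their darkest value over the frame sequence.
    """
    if not frames or not frames[0]:
        raise ValueError("need at least one non-empty frame")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same size")
    return [
        [min(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


# Three toy 2x3 frames: the middle column is a bright, static "caption"
# stroke; the outer columns are background that changes between frames.
frames = [
    [[200, 250, 30], [10, 250, 220]],
    [[40, 250, 210], [230, 250, 20]],
    [[120, 250, 90], [60, 250, 15]],
]
enhanced = integrate_frames(frames)
print(enhanced)  # [[40, 250, 30], [10, 250, 15]]
```

In the result the caption column keeps its value of 250 while every background pixel drops to 40 or below, which raises the contrast between caption and background. A real pipeline would also need to detect how long a caption stays on screen and would operate on interpolated, higher-resolution frames; both steps are omitted here.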