Text in videos carries rich semantic information that can be used for video indexing and browsing. In this paper, we propose a spatiotemporal video-text localization and identification approach that proceeds in two main steps: text region localization and text region identification. In the first step, we detect the significant appearance of new objects in a frame by a split-and-merge process applied to binarized edge differences between frame pairs. Detected objects are, a priori, considered as text; they are then filtered according to both local contrast and texture criteria to retain the effective ones. The resulting text regions are identified using a visual grammar descriptor containing a set of semantic text-region classes characterized by visual features. A visual table of contents is generated from the text regions extracted over the video sequence, enriched with their semantic identification. Experiments performed on a variety of video sequences show the efficiency of our approach.
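The localization step described above (binarizing edge differences between frame pairs, then filtering candidates by local contrast) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the gradient-based edge operator, the function names, and the threshold values are all assumptions made for the sake of a runnable example.

```python
import numpy as np

def edge_map(frame):
    # Approximate an edge-magnitude map with first-order gradients
    # (stand-in for whatever edge detector the paper actually uses).
    gy, gx = np.gradient(frame.astype(float))
    return np.hypot(gx, gy)

def new_object_mask(prev_frame, curr_frame, thresh=10.0):
    # Binarize the edge difference of a frame pair: pixels whose edge
    # strength grows sharply mark newly appearing objects (text candidates).
    diff = edge_map(curr_frame) - edge_map(prev_frame)
    return (diff > thresh).astype(np.uint8)

def passes_contrast_filter(frame, mask, min_contrast=30.0):
    # Keep a candidate region only if its local contrast is high enough;
    # the paper additionally applies a texture criterion, omitted here.
    region = frame[mask.astype(bool)]
    if region.size == 0:
        return False
    return float(region.max()) - float(region.min()) >= min_contrast
```

A new high-contrast block appearing between two otherwise identical frames produces a non-empty mask along its boundary, which then passes the contrast test; a pair of identical frames yields an empty mask.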