We propose a multi-modal approach to retrieving associated news stories that share the same main topic. In the textual domain, we exploit Automatic Speech Recognition (ASR) and refined Optical Character Recognition (OCR) transcripts, while in the visual domain we employ Near-Duplicate Keyframe detection to identify stories that share common visual cues. In addition, we adopt a second visual representation, the semantic signature, which indicates which pre-defined semantic concepts appear in a news story, to improve the discriminativeness of the visual modality. We further propose a query-class weighting scheme to integrate the retrieval results obtained from the visual modalities. Experimental results demonstrate the discriminative power of the enhanced representations in the individual modalities and show that our fusion approach outperforms existing strategies.
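The query-class weighting idea can be illustrated with a minimal late-fusion sketch, assuming each modality has already produced normalized relevance scores per candidate story. The query classes, modality names, and weight values below are purely hypothetical placeholders, not the paper's actual classes or learned weights.

```python
# Hypothetical per-class weights for three modalities:
# "text" (ASR/OCR transcripts), "ndk" (near-duplicate keyframes),
# and "semantic" (semantic-signature matching). Illustrative only.
QUERY_CLASS_WEIGHTS = {
    "named-person": {"text": 0.7, "ndk": 0.2, "semantic": 0.1},
    "general-event": {"text": 0.4, "ndk": 0.3, "semantic": 0.3},
}

def fuse_scores(query_class, modality_scores):
    """Combine one story's per-modality scores as a weighted sum,
    with the weight vector selected by the query's class."""
    weights = QUERY_CLASS_WEIGHTS[query_class]
    return sum(weights[m] * s for m, s in modality_scores.items())

def rank_stories(query_class, candidates):
    """candidates: {story_id: {modality: score}} with scores in [0, 1].
    Returns story ids sorted by fused relevance, best first."""
    fused = {sid: fuse_scores(query_class, scores)
             for sid, scores in candidates.items()}
    return sorted(fused, key=fused.get, reverse=True)
```

The design choice this sketch captures is that different query classes trust different evidence: a person-centric query may lean on transcript matches, while an event query benefits more from shared keyframes and concepts.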