We present a fully automatic system that spans raw data gathering through navigation over heterogeneous news sources, including over 18k hours of broadcast video news, 3.58M online articles, and 430M public Twitter messages. Our system addresses the challenge of extracting "who," "what," "when," and "where" from a truly multimodal perspective, leveraging audiovisual information in broadcast news and the visual content embedded in articles, as well as textual cues in closed captions, raw article text, and social media messages. By performing extraction over time, we can track the trend of each topic in the news and detect interesting peaks in coverage over the life of the topic. We visualize these peaks in trending news topics using automatically extracted keywords and iconic images, and introduce a novel multimodal algorithm for naming speakers in the news. We also present several intuitive navigation interfaces for interacting with these complex topic structures across different news sources.
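The abstract does not specify how coverage peaks are detected; one common approach is to flag days whose article count sharply exceeds a trailing baseline. The sketch below illustrates that idea with a simple moving-average burst heuristic (the `window` and `threshold` parameters are illustrative assumptions, not values from the paper):

```python
from statistics import mean

def detect_peaks(daily_counts, window=7, threshold=2.0):
    """Flag day indices whose coverage exceeds `threshold` times the
    trailing `window`-day average (a simple burst heuristic, not the
    paper's actual method)."""
    peaks = []
    for i in range(window, len(daily_counts)):
        baseline = mean(daily_counts[i - window:i])
        if baseline > 0 and daily_counts[i] > threshold * baseline:
            peaks.append(i)
    return peaks

# Roughly flat coverage, then a spike on day 8
counts = [10, 12, 9, 11, 10, 10, 12, 11, 60, 15]
print(detect_peaks(counts))  # -> [8]
```

A real system would likely smooth the series and normalize across sources before thresholding, but the core idea of comparing each day to a local baseline is the same.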