AMVA'12: ACM international workshop on audio and multimedia methods for large-scale video analysis
Proceedings of the 20th ACM international conference on Multimedia
Since the 1990s, TV series have tended to introduce more and more main characters, and they are often composed of multiple intertwined stories. In this paper, we propose a hierarchical framework for plot de-interlacing that clusters semantic scenes into stories: a story is a group of scenes that are not necessarily contiguous but show a strong semantic relation. Each scene is described using three different modalities (based on color histograms, speaker diarization, or automatic speech recognition outputs) as well as their multimodal combination. We introduce the notion of character-driven episodes, in which stories are emphasized by the presence or absence of characters, and we propose an automatic method, based on a social graph, to detect these episodes. Depending on whether an episode is character-driven or not, the plot de-interlacing, which is a scene clustering task, is performed either through traditional average-link agglomerative clustering with the speaker modality only, or through spectral clustering with the fusion of all modalities. Experiments conducted on twenty-three episodes from three quite different TV series (different lengths and formats) show that the hierarchical framework brings an improvement for all the series.
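The two-branch clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a speaker-based distance matrix and a fused multimodal affinity matrix between scenes are already available, and that the character-driven decision is supplied as a flag. The function names `average_link`, `spectral_bipartition` and `deinterlace` are hypothetical, and the spectral branch is simplified to a two-way split on the normalized graph Laplacian.

```python
import numpy as np


def average_link(dist, n_clusters):
    """Average-link agglomerative clustering on a precomputed distance matrix."""
    dist = np.asarray(dist, dtype=float)
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two groups of scenes.
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(dist), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


def spectral_bipartition(affinity):
    """Two-way spectral split (simplifying assumption: only two stories)."""
    w = np.asarray(affinity, dtype=float)
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))  # assumes every scene has nonzero degree
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues returned in ascending order
    # Eigenvector of the second-smallest eigenvalue, rescaled by D^{-1/2}.
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


def deinterlace(speaker_dist, fused_affinity, character_driven, n_stories=2):
    """Pick the clustering branch from the (externally supplied) episode type."""
    if character_driven:
        # Character-driven episode: speaker modality only, average link.
        return average_link(speaker_dist, n_stories)
    # Otherwise: spectral clustering on the fused modalities
    # (this sketch ignores n_stories and always splits in two).
    return spectral_bipartition(fused_affinity)
```

In this sketch the number of stories is passed in by hand; how it is chosen, how the modalities are fused, and how character-driven episodes are detected from the social graph are all left outside the example.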