In this paper, we investigate a novel framework for dialogue detection based on indicator functions. An indicator function specifies whether a particular actor is present at each time instant. Two dialogue detection rules are developed and assessed. The first rule compares the value of the cross-correlation function at zero time lag to a threshold. The second rule compares the cross-power in a particular frequency band to a threshold. To validate the feasibility of both rules, experiments are carried out using ground-truth indicator functions determined by human observers from six different movies. A total of 25 dialogue scenes and 8 non-dialogue scenes are employed. The probabilities of false alarm and detection are estimated by cross-validation: 70% of the available scenes are used to learn the thresholds employed in the detection rules, and the remaining 30% are used for testing. Almost perfect dialogue detection is reported for every distinct threshold.
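The two statistics behind the detection rules can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the cross-correlation and cross-power are taken between the binary indicator sequences of two actors, that both are normalized by the sequence length, and that the band cross-power is the summed magnitude of the cross-spectrum. The function names, the sampling rate `fs`, the band limits `f_lo` and `f_hi`, and the direction of the threshold comparison are all hypothetical choices made for illustration.

```python
import numpy as np


def zero_lag_cross_correlation(a, b):
    """Cross-correlation of two indicator sequences at zero time lag.

    Assumption: normalized by the sequence length (mean of products).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.mean(a * b))


def band_cross_power(a, b, fs, f_lo, f_hi):
    """Summed magnitude of the cross-power spectrum in [f_lo, f_hi] Hz.

    Assumption: a plain FFT-based estimate with no windowing or averaging.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    spec_a = np.fft.rfft(a)
    spec_b = np.fft.rfft(b)
    freqs = np.fft.rfftfreq(len(a), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    cross_spectrum = spec_a * np.conj(spec_b) / len(a)
    return float(np.sum(np.abs(cross_spectrum[in_band])))


def exceeds_threshold(statistic, threshold):
    """Hypothetical decision step: the comparison direction is an assumption."""
    return statistic > threshold


# Toy indicator functions: two actors strictly alternating, two samples each.
actor_1 = np.tile([1, 1, 0, 0], 4)
actor_2 = 1 - actor_1

c0 = zero_lag_cross_correlation(actor_1, actor_2)
p_band = band_cross_power(actor_1, actor_2, fs=1.0, f_lo=0.2, f_hi=0.3)
```

In this toy case the actors never appear at the same instant, so the zero-lag cross-correlation is zero, while the regular alternation concentrates cross-power near 0.25 cycles per sample. In practice the thresholds would be learned from the training portion of the scenes, as the cross-validation protocol above describes.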