With the increase in cheap, commercially available sensors, recording meetings is becoming an increasingly practical option. With this trend comes the need to summarize the recorded data in semantically meaningful ways. Here, we investigate the task of automatically measuring dominance in small group meetings when only a single audio source is available. Past research has found that speaking length, as a single feature, provides a very good estimate of dominance. For this task we use speaker segmentations generated by our automated, faster-than-real-time speaker diarization algorithm, where the number of speakers is not known beforehand. Using user-annotated data, we analyze how the inherent variability of the annotations affects the performance of our dominance estimation method. We focus primarily on how the performance of the speaker diarization and of the dominance estimation varies under different experimental conditions and computationally efficient strategies, and on how this would affect a practical implementation of such a system. Despite the use of a state-of-the-art speaker diarization algorithm, speaker segments can be noisy. In experiments on almost 5 hours of audio-visual meeting data, our results show that the dominance estimation is robust to increasing diarization noise.
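The speaking-length cue mentioned above can be illustrated with a minimal sketch. It assumes diarization output is available as (speaker label, start, end) tuples in seconds; that format, the function names, and the sample segments are hypothetical and chosen only for illustration, not taken from the paper. It shows one plausible reading of the cue (rank speakers by total speaking time), not the authors' implementation.

```python
from collections import defaultdict


def speaking_lengths(segments):
    """Sum total speaking time per speaker.

    `segments` is an iterable of (speaker, start, end) tuples in seconds.
    This input format is an assumption made for the example.
    """
    totals = defaultdict(float)
    for speaker, start, end in segments:
        if end < start:
            raise ValueError(f"segment ends before it starts: {(speaker, start, end)}")
        totals[speaker] += end - start
    return dict(totals)


def most_dominant(segments):
    """Return the speaker with the longest total speaking time, or None if empty."""
    totals = speaking_lengths(segments)
    if not totals:
        return None
    return max(totals, key=totals.get)


# Hypothetical diarization output for a short meeting excerpt.
segments = [
    ("spk0", 0.0, 12.5),
    ("spk1", 12.5, 15.0),
    ("spk0", 15.0, 30.0),
    ("spk2", 30.0, 34.0),
]

print(speaking_lengths(segments))
print(most_dominant(segments))
```

Because the estimate depends only on accumulated durations, moderate errors in segment boundaries or occasional speaker confusions shift the totals only slightly, which is consistent with the robustness to diarization noise reported in the abstract.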