Audio Data Indexing: Use of Second-Order Statistics for Speaker-Based Segmentation

Authors:
Perrine Delacourt;Christian Wellekens
Affiliations:
Institut EURECOM;Institut EURECOM
Venue:
ICMCS '99 Proceedings of the 1999 IEEE International Conference on Multimedia Computing and Systems - Volume 02
Year:
1999

Citing 0
Cited 2

Scene Change Detection Based on Audio-Visual Analysis and Interaction
Proceedings of the 10th International Workshop on Theoretical Foundations of Computer Vision: Multi-Image Analysis

Speaker Clustering Aided by Visual Dialogue Analysis
PCM '08 Proceedings of the 9th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing

Abstract

The content-based indexing task considered in this paper consists of recognizing, from their voices, the speakers involved in a conversation. A new approach to speaker-based segmentation, the necessary first step of this indexing task, is described. Our study assumes that no prior information on the speakers is available, that the number of speakers is unknown, and that people do not speak simultaneously. Audio data indexing is commonly divided into two parts: the audio data is first segmented with respect to speaker utterances, and the resulting segments associated with a given speaker are then merged. In this work, we focus on the first part and propose a new segmentation method based on second-order statistics. The practical significance of this study is illustrated by applying the new technique to real data to show its efficiency.
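
The abstract does not say which second-order statistic the authors use, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each window of acoustic feature frames is modeled by a single full-covariance Gaussian and compares two adjacent windows with a generalized likelihood ratio (GLR) computed from their covariance matrices. The function names, window sizes, and synthetic features are all hypothetical choices made for illustration.

```python
import numpy as np


def log_det_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of (n, d) frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance is not positive definite; use a longer window")
    return logdet


def glr_distance(left, right):
    """Log GLR between 'one Gaussian for both windows' and 'one Gaussian each'.

    Larger values suggest the two windows come from different sources.
    This measure is an assumption, not taken from the abstract.
    """
    both = np.vstack([left, right])
    return 0.5 * (len(both) * log_det_cov(both)
                  - len(left) * log_det_cov(left)
                  - len(right) * log_det_cov(right))


def distance_curve(features, win, hop):
    """Slide two adjacent windows of `win` frames over the features.

    Returns the candidate boundary positions (frame indices) and the
    distance measured at each of them.
    """
    positions = np.arange(win, len(features) - win + 1, hop)
    distances = np.array([
        glr_distance(features[t - win:t], features[t:t + win])
        for t in positions
    ])
    return positions, distances


# Synthetic stand-in for real features: two "speakers" with different
# means and variances, switching at frame 400 (hypothetical data).
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(0.0, 1.0, size=(400, 4)),
    rng.normal(3.0, 2.0, size=(400, 4)),
])
positions, distances = distance_curve(features, win=100, hop=10)
peak = positions[np.argmax(distances)]
print("largest distance at frame", peak)
```

In this sketch, local maxima of the distance curve are candidate speaker changes. Consistent with the assumptions stated in the abstract, it needs no prior speaker models and no knowledge of the number of speakers. Merging the resulting segments by speaker, the second part of the indexing task, is outside its scope.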