BIC-based speaker segmentation using divide-and-conquer strategies with application to speaker diarization

Authors:
Shih-Sian Cheng;Hsin-Min Wang;Hsin-Chia Fu
Affiliations:
Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan and Institute of Information Science, Academia Sinica, Taipei, Taiwan;Institute of Information Science, Academia Sinica, Taipei, Taiwan;Department of Computer Science, National Chiao Tung University, Hsinchu, Taiwan
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

In this paper, we propose three divide-and-conquer approaches for Bayesian information criterion (BIC)-based speaker segmentation. The approaches detect speaker changes by recursively partitioning a large analysis window into two sub-windows and recursively verifying the merging of two adjacent audio segments using ΔBIC, a widely adopted distance measure between two audio segments. We compare our approaches to three popular distance-based approaches, namely, Chen and Gopalakrishnan's window-growing-based approach, Siegler et al.'s fixed-size sliding window approach, and Delacourt and Wellekens's DISTBIC approach, by performing computational cost analysis and conducting speaker change detection experiments on two broadcast news data sets. The results show that the proposed approaches are more efficient and achieve higher segmentation accuracy than the compared distance-based approaches. In addition, we apply the segmentation approaches discussed in this paper to the speaker diarization task. The experimental results show that a more effective segmentation approach leads to better diarization accuracy.
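
To make the recursive partitioning idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes each segment is modeled by a single full-covariance Gaussian, uses the commonly cited form of ΔBIC (a log-determinant term minus a penalty weighted by λ), and covers only the top-down splitting step, leaving out the merge verification the abstract also mentions. The names `delta_bic`, `split_recursively`, `lam`, `min_len`, and `step` are hypothetical and chosen for illustration.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Delta-BIC between two segments (rows = frames, columns = features).

    A positive value favors modeling x and y with two separate Gaussians,
    which is read here as evidence of a speaker change between them.
    """
    z = np.vstack([x, y])
    n1, n2, n = len(x), len(y), len(z)
    d = z.shape[1]

    def logdet(a):
        # Maximum-likelihood covariance; slogdet is numerically safer than log(det).
        cov = np.cov(a, rowvar=False, bias=True)
        return np.linalg.slogdet(np.atleast_2d(cov))[1]

    # Penalty for the extra mean vector and covariance matrix of the two-model case.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return 0.5 * (n * logdet(z) - n1 * logdet(x) - n2 * logdet(y)) - penalty


def split_recursively(feats, start, end, min_len=50, lam=1.0, step=10):
    """Top-down sketch: pick the best split of feats[start:end], then recurse.

    Returns a sorted list of frame indices judged to be change points.
    min_len and step are assumed tuning knobs, not values from the paper.
    """
    best_t, best_score = None, 0.0
    for t in range(start + min_len, end - min_len + 1, step):
        score = delta_bic(feats[start:t], feats[t:end], lam)
        if score > best_score:
            best_t, best_score = t, score
    if best_t is None:
        return []  # no split has positive Delta-BIC: treat the window as one speaker
    left = split_recursively(feats, start, best_t, min_len, lam, step)
    right = split_recursively(feats, best_t, end, min_len, lam, step)
    return left + [best_t] + right
```

Calling `split_recursively(feats, 0, len(feats))` on a matrix of frame-level features (for example, cepstral coefficients) returns the candidate change points. Because each recursive call works on a shorter window, the statistics are re-estimated on progressively more homogeneous data, which is the intuition behind partitioning a large analysis window rather than sliding a fixed one.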