UBM based speaker segmentation and clustering for 2-speaker detection

Authors:
Jing Deng;Thomas Fang Zheng;Wenhu Wu
Affiliations:
Center for Speech Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing;Center for Speech Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing;Center for Speech Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing
Venue:
ISCSLP'06 Proceedings of the 5th international conference on Chinese Spoken Language Processing
Year:
2006



Abstract

In this paper, a speaker segmentation method based on the log-likelihood ratio score (LLRS) over a universal background model (UBM) and a speaker clustering method based on the difference of log-likelihood scores between two speaker models are proposed. During segmentation, the LLRS between two adjacent speech segments over the UBM is used as a distance measure, while during clustering, the difference of log-likelihood scores between two speaker models is used as the speaker classification criterion. A complete system for the NIST 2002 2-speaker task is presented using these methods. Experimental results on the NIST 2002 Switchboard Cellular speaker segmentation corpus, the 1-speaker evaluation corpus, and the 2-speaker evaluation corpus show the potential of the proposed algorithms.
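
The abstract does not give the scoring formulas, so the following is only a minimal sketch of the two kinds of score it names, under stated assumptions: models are diagonal-covariance Gaussian mixtures stored as (weights, means, variances) tuples, scores are averaged per frame, and the function names (gmm_avg_loglik, llr_over_ubm, assign_speaker) and toy data are hypothetical. How the paper derives a model from a speech segment, and how it thresholds the scores, is not stated here and is left out.

```python
import numpy as np

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means, variances: (K, D).
    """
    diff = frames[:, None, :] - means[None, :, :]                 # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_joint = log_comp + np.log(weights)                        # (T, K)
    # Log-sum-exp over components, stabilised by subtracting the row maximum.
    m = log_joint.max(axis=1, keepdims=True)
    per_frame = m[:, 0] + np.log(np.sum(np.exp(log_joint - m), axis=1))
    return float(np.mean(per_frame))

def llr_over_ubm(frames, model, ubm):
    """Log-likelihood ratio of a model against the UBM (assumed form)."""
    return gmm_avg_loglik(frames, *model) - gmm_avg_loglik(frames, *ubm)

def assign_speaker(frames, model_a, model_b):
    """Label a segment by the sign of the log-likelihood difference
    between two speaker models (assumed decision rule, threshold 0)."""
    diff = gmm_avg_loglik(frames, *model_a) - gmm_avg_loglik(frames, *model_b)
    return ("A" if diff >= 0 else "B"), diff

# Toy, hand-made models (not from the paper): two speakers and a UBM
# that is an equal mixture of both.
model_a = (np.array([1.0]), np.array([[-2.0, -2.0]]), np.ones((1, 2)))
model_b = (np.array([1.0]), np.array([[2.0, 2.0]]), np.ones((1, 2)))
ubm = (np.array([0.5, 0.5]),
       np.array([[-2.0, -2.0], [2.0, 2.0]]),
       np.ones((2, 2)))

segment = np.full((5, 2), -2.0)   # frames sitting on speaker A's mean
label, diff = assign_speaker(segment, model_a, model_b)
score = llr_over_ubm(segment, model_a, ubm)
```

In this toy setup the segment is assigned to speaker A and its score against the UBM is positive. With real data the models would be trained or adapted from speech features, which this sketch does not attempt.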