Speaker diarization exploiting the eigengap criterion and cluster ensembles

Authors:
Nikoletta Bassiou;Vassiliki Moschou;Constantine Kotropoulos
Affiliations:
Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece;Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece;Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Venue:
IEEE Transactions on Audio, Speech, and Language Processing
Year:
2010

Abstract

A novel system for speaker diarization is proposed that combines the eigengap criterion with cluster ensembles. No explicit assumptions are made about the number of speakers. Two variants of the system are developed: the first does not cluster the speech segments that are detected as outliers, while the second does. Both variants are assessed with respect to several metrics, such as the overall classification error, the average cluster purity, and the average speaker purity. Experiments are conducted on two-person dialogue scenes in movies as well as on news broadcasts from the MDE RT-03 Training Data Speech Corpus released by the U.S. National Institute of Standards and Technology. In the latter case, the diarization error rate is also reported. It is demonstrated that the clustering performance does not degrade when outliers are present. Moreover, thanks to the eigengap criterion, the evaluation metrics are improved.
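
The abstract does not say how the eigengap criterion is applied inside the system. As a rough illustration only, the generic eigengap heuristic from spectral clustering picks the number of clusters k at which the gap between consecutive eigenvalues of a graph Laplacian is largest. The sketch below assumes the eigenvalues have already been computed elsewhere; the function name `eigengap_num_clusters` and the `max_k` parameter are hypothetical and not taken from the paper.

```python
def eigengap_num_clusters(eigenvalues, max_k=None):
    """Estimate the number of clusters with the eigengap heuristic.

    `eigenvalues` are assumed to be the eigenvalues of a graph Laplacian
    built from pairwise segment similarities (computed elsewhere).
    `max_k` optionally caps the largest cluster count considered.
    """
    values = sorted(eigenvalues)  # ascending: lambda_1 <= lambda_2 <= ...
    if len(values) < 2:
        raise ValueError("need at least two eigenvalues to form a gap")

    # The largest k we can test is len(values) - 1, because the gap for k
    # needs both lambda_k and lambda_{k+1}.
    limit = len(values) - 1
    if max_k is not None:
        limit = min(max_k, limit)
    if limit < 1:
        raise ValueError("max_k must be at least 1")

    # gaps[k - 1] holds lambda_{k+1} - lambda_k for k = 1 .. limit.
    gaps = [values[k] - values[k - 1] for k in range(1, limit + 1)]

    # Choose the k with the largest gap (the first one in case of ties).
    best_index = max(range(len(gaps)), key=gaps.__getitem__)
    return best_index + 1


# Three near-zero eigenvalues followed by a jump suggest three clusters,
# i.e. three speakers in a diarization setting.
spectrum = [0.0, 0.001, 0.002, 0.9, 1.0, 1.1]
estimated_speakers = eigengap_num_clusters(spectrum)
```

Because the estimate comes from the spectrum itself, a heuristic of this kind needs no prior assumption on the number of speakers, which matches the claim in the abstract.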