A speaker diarization system based on an information theoretic framework is described. The problem is formulated according to the Information Bottleneck (IB) principle. Unlike approaches in which the distance between speaker segments is chosen arbitrarily, the IB method seeks the partition that maximizes the mutual information between the observations and the variables relevant to the problem while minimizing the distortion between observations. This removes the need to choose a distance between speech segments: the Jensen-Shannon divergence arises directly from optimizing the IB objective function. We discuss issues related to speaker diarization in this framework, such as the criteria for inferring the number of speakers, the tradeoff between quality and compression achieved by the diarization system, and the algorithms for optimizing the objective function. Furthermore, we benchmark the proposed system against a state-of-the-art system on the NIST RT06 (Rich Transcription) data set for speaker diarization of meetings. The IB-based system achieves a diarization error rate of 23.2%, compared to 23.6% for the baseline system. Because the approach is based mainly on nonparametric clustering, it runs significantly faster than the baseline HMM/GMM-based system, resulting in faster-than-real-time diarization.
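To illustrate how the Jensen-Shannon divergence can serve as the distance between segments, the following is a minimal sketch of the merge cost used in the standard agglomerative IB formulation. It is an illustration under stated assumptions, not the system described above: the function names, the use of NumPy, and the representation of each cluster as a prior weight plus a distribution over relevance variables are all hypothetical choices made for this example.

```python
import numpy as np


def kl_divergence(p, q):
    """KL(p || q) in nats; terms where p == 0 contribute zero."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def js_divergence(p, q, w_p=0.5, w_q=0.5):
    """Weighted Jensen-Shannon divergence between distributions p and q.

    w_p and w_q are positive mixture weights that sum to one.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mixture = w_p * p + w_q * q
    return w_p * kl_divergence(p, mixture) + w_q * kl_divergence(q, mixture)


def merge_cost(prior_i, prior_j, p_y_given_i, p_y_given_j):
    """Loss in relevant mutual information if clusters i and j are merged.

    Assumption: this is the textbook agglomerative IB cost,
    (p(c_i) + p(c_j)) * JS(p(y|c_i), p(y|c_j)), with the JS weights
    given by the normalized cluster priors.
    """
    total = prior_i + prior_j
    return total * js_divergence(
        p_y_given_i, p_y_given_j, prior_i / total, prior_j / total
    )
```

In a greedy agglomerative scheme, each step would merge the pair of clusters with the smallest `merge_cost`, so segments with similar distributions over the relevance variables are grouped first. Two clusters with identical distributions cost nothing to merge, while two clusters with non-overlapping distributions incur the largest cost for their combined weight.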