Speaker clustering is the task of grouping a set of speech utterances into speaker-specific classes. The basic techniques for solving this task are similar to those used for speaker verification and identification. The hypothesis of this paper is that the techniques originally developed for speaker verification and identification are not sufficiently discriminative for speaker clustering. However, the processing chain for speaker clustering is long, and there are many potential areas for improvement. The question is: at which stage would changes most improve the final result? To answer this question, this paper takes a biomimetic approach based on a study in which human participants act as an automatic speaker clustering system. Our findings are twofold: the modeling stage has the highest potential for improvement, and information about the temporal succession of frames is crucially missing. Experimental results on TIMIT data, obtained with our implementation of a speaker clustering system that incorporates these findings, show the validity of our approach.