We investigate how a speaker's preference for specific topics can be used for speaker identification. In domains such as broadcast news or parliamentary speeches, speakers are associated with a field of expertise. We explore how topic information for a segment of speech, extracted from an automatic speech recognition transcript, can be employed to identify the speaker. Two methods for modelling topic preferences are compared: an implicit one, based on speaker-characteristic keywords, and an explicit one, which uses automatically derived topic models to assign topics to the speech segments. In the keyword-based approach, the segments' tf-idf vectors are classified with Support Vector Machine speaker models. In the topic-model-based approach, a domain-specific topic model is used to represent each segment as a mixture of topics; each speaker's score is derived from the Kullback-Leibler divergence between the topic mixture of that speaker's training data and the topic mixture of the segment. The methods were tested on political speeches given in the German parliament by 235 politicians. We found that topic cues do carry speaker information: the topic-model-based system yielded an equal error rate (EER) of 16.3%. The topic-based approach also combined well with a spectral baseline system, reducing the EER from 8.6% for the spectral system alone to 6.2% for the fused system.
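The topic-model-based scoring described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that topic mixtures are already available as probability vectors over the same set of topics, that the divergence is taken from the segment to the speaker (the abstract does not state the direction), and that the score is simply the negated divergence. The names `kl_divergence`, `rank_speakers`, and the example speakers and mixtures are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-10):
    """Return KL(p || q) for two discrete distributions over the same topics."""
    # Add a small constant and renormalise so that zero-probability
    # topics do not lead to log(0) or division by zero (an assumption,
    # not a smoothing scheme taken from the paper).
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_speakers(segment_mixture, speaker_mixtures):
    """Rank speakers by how closely their topic mixture matches the segment.

    The score is the negated divergence, so a higher score means a
    closer match. Returns a list of (speaker, score), best match first.
    """
    scores = {
        speaker: -kl_divergence(segment_mixture, mixture)
        for speaker, mixture in speaker_mixtures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical topic mixtures over three topics.
speaker_mixtures = {
    "speaker_a": [0.6, 0.3, 0.1],
    "speaker_b": [0.1, 0.2, 0.7],
}
segment_mixture = [0.7, 0.2, 0.1]

ranking = rank_speakers(segment_mixture, speaker_mixtures)
best_speaker = ranking[0][0]
```

In a verification setting, a score of this kind would be compared against a threshold, and the EER is the operating point at which the false acceptance and false rejection rates are equal.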