This article describes our technical demonstration of an online speaker identification system for conversations. A laptop with an internal microphone is placed at the center of a meeting-room table. The system identifies the current speaker independently of the spoken text or language, with a latency of about 1.5 seconds and an accuracy of about 85% (as evaluated against the NIST RT benchmark). A Java GUI shows an image of the current speaker along with a timeline of past speakers. Speakers are added to the system's database using a one-minute training procedure.
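The enroll-then-identify flow described above can be illustrated with a minimal sketch. It is not the demonstrated system: the abstract does not specify the features or the speaker model, so the sketch assumes one diagonal Gaussian per speaker as a stand-in, synthetic feature vectors in place of real acoustic features, and a hypothetical sliding window of 150 frames (which would span about 1.5 seconds only under an assumed rate of 100 frames per second). All class and function names are hypothetical.

```python
import math
import random
from collections import deque


class DiagonalGaussian:
    """Single diagonal-covariance Gaussian (an assumed stand-in for a real speaker model)."""

    def __init__(self, frames):
        n = len(frames)
        dim = len(frames[0])
        self.mean = [sum(f[d] for f in frames) / n for d in range(dim)]
        # Floor the variance so the log-likelihood never divides by zero.
        self.var = [
            max(sum((f[d] - self.mean[d]) ** 2 for f in frames) / n, 1e-6)
            for d in range(dim)
        ]

    def log_likelihood(self, frame):
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(frame, self.mean, self.var)
        )


class OnlineIdentifier:
    """Scores a sliding window of recent frames against every enrolled speaker."""

    def __init__(self, window_frames=150):  # hypothetical window length
        self.models = {}
        self.window = deque(maxlen=window_frames)

    def enroll(self, name, frames):
        # Stands in for the one-minute training step: fit a model on enrollment frames.
        self.models[name] = DiagonalGaussian(frames)

    def push(self, frame):
        # Add the newest frame, then return the best-scoring speaker for the window.
        self.window.append(frame)
        if not self.models:
            return None
        scores = {
            name: sum(model.log_likelihood(f) for f in self.window)
            for name, model in self.models.items()
        }
        return max(scores, key=scores.get)


def synthetic_frames(center, count, rng, dim=4):
    """Synthetic feature vectors; a real system would extract features from audio."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
identifier = OnlineIdentifier(window_frames=150)
identifier.enroll("alice", synthetic_frames(-3.0, 600, rng))
identifier.enroll("bob", synthetic_frames(3.0, 600, rng))

# Simulate a conversation: 200 frames from one speaker, then 200 from the other.
decisions = [identifier.push(f) for f in synthetic_frames(-3.0, 200, rng)]
decisions += [identifier.push(f) for f in synthetic_frames(3.0, 200, rng)]
print(decisions[199], decisions[-1])
```

In this sketch the decision lags a speaker change until the new speaker's frames outweigh the old ones in the window, which is one plausible source of the kind of latency the abstract reports. The window length trades responsiveness against stability: a shorter window reacts faster but is easier to mislead with a few atypical frames.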