Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization can help speech recognition, facilitate the searching and indexing of audio archives, and enrich automatic transcriptions, making them more readable. In this paper, we provide an overview of the approaches currently used in a key area of audio diarization, namely speaker diarization, and discuss their relative merits and limitations. The performance of the different techniques is compared within the framework of the speaker diarization task in the DARPA EARS Rich Transcription evaluations. We also look at how the techniques are being introduced into real broadcast news systems and at their portability to other domains and tasks, such as meetings and speaker verification.
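As a rough illustration of the definition above, a diarization output can be thought of as a list of time regions, each attributed to a source, where regions are allowed to overlap. The sketch below is only one possible representation and is not taken from the paper: the `Segment` class, the helper functions, and the labels and times in `annotation` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One annotated region: a time span attributed to a single source."""
    start: float  # seconds from the beginning of the recording
    end: float    # seconds, exclusive
    source: str   # hypothetical label, e.g. "speaker_A", "music", "noise"


def sources_at(segments, t):
    """Return the sorted labels of all sources active at time t."""
    return sorted(s.source for s in segments if s.start <= t < s.end)


def total_time_per_source(segments):
    """Sum the annotated duration of each source, in seconds."""
    totals = {}
    for s in segments:
        totals[s.source] = totals.get(s.source, 0.0) + (s.end - s.start)
    return totals


# Made-up annotation: music fades out under speaker A,
# and speaker B starts before speaker A has finished.
annotation = [
    Segment(0.0, 4.0, "music"),
    Segment(3.0, 9.0, "speaker_A"),
    Segment(8.0, 12.0, "speaker_B"),
]

if __name__ == "__main__":
    print(sources_at(annotation, 3.5))
    print(sources_at(annotation, 8.5))
    print(total_time_per_source(annotation))
```

Keeping each region as an independent record, rather than assigning one label per time frame, is what lets overlapping sources be expressed directly. Restricting the labels to speakers gives the speaker diarization case the paper focuses on.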