Speaker indexing, or diarization, is an important task in audio processing and retrieval: it is the process of labeling a speech signal with labels corresponding to the identities of the speakers. This paper presents a comprehensive review of the evolution of the technology and of the different approaches to speaker indexing, with a detailed discussion of these approaches and their contributions. It reviews the most common features used for speaker diarization, as well as the most important approaches to speech activity detection (SAD) in diarization frameworks. The two main subtasks of speaker indexing are speaker segmentation and speaker clustering; the approaches proposed for each are reviewed separately, and diarization systems that combine the two in a unified framework are also covered. A further discussion concerns online speaker indexing, which differs fundamentally from traditional offline approaches. The paper also introduces the most common performance measures and evaluation datasets. It concludes by proposing a complete framework for speaker indexing that is intended to be domain independent, parameter free, and applicable to both online and offline applications.