Unsupervised speaker segmentation with residual phase and MFCC features

Authors:
S. Jothilakshmi; V. Ramalingam; S. Palanivel
Affiliations:
Department of Computer Science and Engineering, Annamalai University, Annamalainagar, Tamilnadu 608 002, India (all three authors)
Venue:
Expert Systems with Applications: An International Journal
Year:
2009

Abstract

This paper proposes an unsupervised method that improves automatic speaker segmentation by combining evidence from the residual phase (RP) and mel-frequency cepstral coefficients (MFCC). The method demonstrates that the speaker-specific information in the residual phase is complementary to that in conventional MFCC features. The paper also presents an unsupervised speaker segmentation algorithm based on support vector machines (SVM). Experiments show that combining residual phase and MFCC features helps to identify the transitions between speakers more robustly.
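The abstract does not say how the residual phase is computed. As a rough illustration only, the sketch below follows the definition commonly used in the speech literature: the linear prediction (LP) residual of a frame is extended to its analytic signal, and the residual phase is the cosine of that signal's phase. The function names, the LP order of 10, and the 8 kHz sampling rate in the usage lines are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def lp_coefficients(frame, order):
    """Autocorrelation-method LP coefficients via Levinson-Durbin.

    Returns a[0..order] with a[0] = 1, so the residual is sum_k a[k] * s[n-k].
    """
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame: stop early
            break
        k = -np.dot(a[:i], r[i:0:-1]) / err  # reflection coefficient
        a[: i + 1] = a[: i + 1] + k * a[: i + 1][::-1]
        err *= 1.0 - k * k
    return a


def analytic_signal(x):
    """Analytic signal via the FFT (keep DC, double positive frequencies)."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1 : n // 2] = 2.0
    else:
        gain[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)


def residual_phase(frame, order=10):
    """Cosine of the phase of the analytic signal of the LP residual."""
    frame = np.asarray(frame, dtype=float)
    a = lp_coefficients(frame, order)
    residual = np.convolve(frame, a)[: len(frame)]  # inverse-filter the frame
    envelope = np.abs(analytic_signal(residual))    # Hilbert envelope
    return residual / np.maximum(envelope, 1e-12)   # guard against divide-by-zero


# Hypothetical usage: a 50 ms synthetic frame at an assumed 8 kHz sampling rate.
rng = np.random.default_rng(0)
t = np.arange(400) / 8000.0
frame = np.sin(2 * np.pi * 200 * t) + 0.1 * rng.standard_normal(400)
rp = residual_phase(frame, order=10)
```

Dividing the residual by its Hilbert envelope removes amplitude information, so the values stay within [-1, 1] and mainly reflect excitation-source timing. That is one plausible reason such a feature could complement MFCC features, which chiefly describe the spectral envelope.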