Robust several-speaker speech recognition with highly dependable online speaker adaptation and identification

Authors:
Po-Yi Shih;Po-Chuan Lin;Jhing-Fa Wang;Yuan-Ning Lin
Affiliations:
Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan;Department of Electronics Engineering and Computer Science, Tung Fang Institute of Technology, Kaohsiung, Taiwan;Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan;Department of Electrical Engineering, National Cheng Kung University, Tainan, Taiwan
Venue:
Journal of Network and Computer Applications
Year:
2011

Abstract

Current adaptive mechanisms adapt a single acoustic model to a speaker in a speaker-independent speech recognition system. However, as more users share the same speech recognizer, single-model adaptation leads to negative adaptation when switching between users, which makes the adaptation undependable. Considering the setting of a smart home or an office with several staff members, this paper presents speaker-specific acoustic model adaptation based on a multi-model mechanism to solve the problem of undependable adaptation. First, the identity of the current speaker is confirmed using an SVM classifier; the corresponding acoustic parameters are then extracted and integrated with the speaker-independent acoustic model to yield a speaker-dependent acoustic model, which improves speech recognition accuracy for the current speaker. To provide dependable adaptation data and achieve positive online speaker adaptation, a confidence-scoring mechanism is designed to verify each recognition result and determine whether it can be used as an adaptation datum. The experimental results indicate that the proposed system can effectively increase the average speech recognition accuracy from 62% to 85%. Thus, the proposed system can achieve robust several-speaker speech recognition with highly dependable online speaker adaptation and identification.
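
The control flow described in the abstract (identify the speaker, select that speaker's parameters, and adapt only on confident results) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the adaptation method or the confidence threshold, so the running mean offset, the value of `CONFIDENCE_THRESHOLD`, the names `MultiModelAdapter` and `SpeakerState`, and the `toy_identify` function (a stand-in for the SVM classifier) are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical threshold; the abstract does not give a value.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class SpeakerState:
    """Per-speaker adaptation state (illustrative simplification)."""
    offset: list      # running mean of (features - speaker-independent mean)
    count: int = 0    # number of accepted adaptation utterances


class MultiModelAdapter:
    """Keeps one set of adaptation parameters per identified speaker."""

    def __init__(self, si_mean, identify):
        self.si_mean = list(si_mean)  # speaker-independent (SI) model mean
        self.identify = identify      # callable: features -> speaker id
        self.speakers = {}            # speaker id -> SpeakerState

    def model_for(self, speaker):
        """Combine the SI model with the speaker's own parameters."""
        state = self.speakers.get(speaker)
        if state is None:
            return list(self.si_mean)  # unseen speaker: fall back to SI model
        return [m + o for m, o in zip(self.si_mean, state.offset)]

    def observe(self, features, confidence):
        """Adapt the identified speaker's model only if the result is confident."""
        speaker = self.identify(features)
        if confidence < CONFIDENCE_THRESHOLD:
            return speaker, False      # rejected as adaptation data
        state = self.speakers.setdefault(
            speaker, SpeakerState(offset=[0.0] * len(self.si_mean)))
        state.count += 1
        for i, (x, m) in enumerate(zip(features, self.si_mean)):
            # Incremental mean update of this speaker's offset only.
            state.offset[i] += ((x - m) - state.offset[i]) / state.count
        return speaker, True


def toy_identify(features):
    # Stand-in for the SVM speaker classifier: splits on the first feature.
    return "alice" if features[0] >= 0 else "bob"


adapter = MultiModelAdapter(si_mean=[0.0, 0.0], identify=toy_identify)
adapter.observe([2.0, 1.0], confidence=0.9)    # accepted, adapts "alice"
adapter.observe([-2.0, 1.0], confidence=0.9)   # accepted, adapts "bob"
adapter.observe([8.0, 8.0], confidence=0.2)    # rejected, no model changes
```

Because each speaker's parameters are stored separately, data from one user never shifts another user's model, and the low-confidence third utterance leaves every model unchanged. A single shared model would instead be pulled toward whichever speaker spoke last, which is the negative adaptation the abstract describes.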