A GMM-based speaker identification system on FPGA

Authors:
Phak Len Eh Kan;Tim Allen;Steven F. Quigley
Affiliations:
School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, United Kingdom;School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, United Kingdom;School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, United Kingdom
Venue:
ARC'10 Proceedings of the 6th International Conference on Reconfigurable Computing: Architectures, Tools and Applications
Year:
2010

Abstract

Speaker identification is the process of identifying people from their voices. Speaker-specific characteristics exist in speech signals because different speakers have different vocal tract resonances, and these can be exploited by extracting feature vectors such as Mel-frequency cepstral coefficients (MFCCs) from the speech signal. A Gaussian mixture model (GMM), a well-known statistical model, then models the distribution of each speaker’s MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes the hardware implementation of the classification stage of a text-independent GMM-based speaker identification system. A speed-up factor of 90 was achieved compared to a software-based implementation on a standard PC.
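
The classification step summarized in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design. It assumes diagonal-covariance Gaussians, which the abstract does not specify, and uses toy 2-D vectors in place of real MFCC frames. The function names and the two speaker models ("alice", "bob") are hypothetical. Each speaker's GMM scores every frame, the per-frame log-likelihoods are summed, and the speaker with the highest total is chosen.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log(sum_k w_k * N(x | mean_k, var_k)).

    gmm is a list of (weight, mean, var) tuples (an assumed layout).
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp keeps the mixture sum numerically stable
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and the score of every speaker."""
    scores = {
        name: gmm_log_likelihood(frames, gmm)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Toy models: two mixture components per speaker, 2-D features (illustrative only).
speaker_models = {
    "alice": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "bob": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best, scores = identify_speaker(frames, speaker_models)
```

Every frame is scored independently against every mixture component of every speaker model, so the inner computation is a large set of identical, independent multiply-accumulate operations. This regularity is one plausible reading of why the abstract calls the approach promising for hardware acceleration.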