FPGA implementation for GMM-based speaker identification

Authors:
Phaklen EhKan;Timothy Allen;Steven F. Quigley
Affiliations:
School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, UK and School of Computer and Communication Engineering, University Malaysia Perlis, Per ...;School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, UK;School of Electronic, Electrical and Computer Engineering, University of Birmingham, Edgbaston, Birmingham, UK
Venue:
International Journal of Reconfigurable Computing - Special issue on selected papers from the southern programmable logic conference (SPL2010)
Year:
2011

Abstract

In today's society, highly accurate personal identification systems are required. Passwords and PINs can be forgotten or forged and are no longer considered to offer a high level of security. The use of biological features (biometrics) is becoming widely accepted as the next level for security systems. Biometric speaker identification is a method of identifying a person from their voice. Speaker-specific characteristics exist in speech signals because different speakers have different vocal tract resonances. These differences can be exploited by extracting feature vectors such as Mel-Frequency Cepstral Coefficients (MFCCs) from the speech signal. A well-known statistical modelling technique, the Gaussian Mixture Model (GMM), then models the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes a hardware implementation of the classification stage of a text-independent GMM-based speaker identification system. The aim was to produce a system that can identify large numbers of voice streams simultaneously in real time. This has important potential applications in security and in automated call centres. A speedup factor of ninety was achieved compared to a software implementation on a standard PC.
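The classification step summarised in the abstract can be illustrated with a minimal software sketch. It is not the authors' hardware design. It assumes diagonal-covariance GMMs and a maximum log-likelihood decision rule, neither of which is stated in the abstract. The function names, speaker names, and toy two-dimensional "MFCC" values are all hypothetical. Each frame is scored against every speaker's model, the per-frame log-likelihoods are summed, and the speaker with the highest total is returned.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log-likelihood of one frame under a GMM (log-sum-exp for stability)."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify(frames, speaker_models):
    """Return (best speaker, per-speaker total log-likelihood)."""
    scores = {
        name: sum(log_gmm(x, *model) for x in frames)
        for name, model in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical models: (weights, means, variances), two components each.
MODELS = {
    "alice": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "bob": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}
# Toy feature frames standing in for MFCC vectors (illustrative values only).
FRAMES = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify(FRAMES, MODELS)
print(best)
```

The score for each speaker depends only on that speaker's model and the incoming frames, so the per-speaker and per-component computations are independent of one another. This independence is one plausible reason such a system lends itself to parallel hardware.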