An overview of text-independent speaker recognition: From features to supervectors

Authors:
Tomi Kinnunen; Haizhou Li
Affiliations:
Department of Computer Science and Statistics, Speech and Image Processing Unit, University of Joensuu, P.O. Box 111, 80101 Joensuu, Finland; Department of Human Language Technology, Institute for Infocomm Research (I2R), 1 Fusionopolis Way, #21-01 Connexis, South Tower, Singapore 138632, Singapore
Venue:
Speech Communication
Year:
2010

Abstract

This paper gives an overview of automatic speaker recognition technology, with an emphasis on text-independent recognition. Speaker recognition has been studied actively for several decades, and we cover both the classical and the state-of-the-art methods. We start with the fundamentals of automatic speaker recognition: feature extraction and speaker modeling. We then elaborate on advanced computational techniques that address robustness and session variability. The recent progress from vectors towards supervectors opens up a new area of exploration and represents a technology trend; we provide an overview of this development and discuss the evaluation methodology of speaker recognition systems. We conclude the paper with a discussion of future directions.