A comparison of procedures for the calculation of forensic likelihood ratios from acoustic-phonetic data: Multivariate kernel density (MVKD) versus Gaussian mixture model-universal background model (GMM-UBM)

Authors:
Geoffrey Stewart Morrison
Affiliations:
School of Language Studies, Australian National University, Canberra, ACT 0200, Australia and Forensic Voice Comparison Laboratory, School of Electrical Engineering & Telecommunications, Universit ...
Venue:
Speech Communication
Year:
2011

Abstract

Two procedures for the calculation of forensic likelihood ratios were tested on the same set of acoustic-phonetic data. One procedure was a multivariate kernel density (MVKD) procedure, which is common in acoustic-phonetic forensic voice comparison, and the other was a Gaussian mixture model-universal background model (GMM-UBM) procedure, which is common in automatic forensic voice comparison. The data were coefficient values from discrete cosine transforms fitted to second-formant trajectories of /aɪ/, /eɪ/, /oʊ/, /aʊ/, and /ɔɪ/ tokens produced by 27 male speakers of Australian English. Scores were calculated separately for each phoneme and then fused using logistic regression. The performance of the fused GMM-UBM system was much better than that of the fused MVKD system, both in terms of accuracy (as measured using the log-likelihood-ratio cost, C_llr) and precision (as measured using an empirical estimate of the 95% credible interval for the likelihood ratios from the different-speaker comparisons).
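For readers unfamiliar with the accuracy metric named above, the following is a minimal sketch of the standard log-likelihood-ratio cost formula: the mean of log2(1 + 1/LR) over same-speaker comparisons and the mean of log2(1 + LR) over different-speaker comparisons, averaged together. The function name `cllr` and the example likelihood ratios are hypothetical, chosen only for illustration; they are not values or code from the paper.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost for plain likelihood ratios (not log LRs).

    Lower is better; a system that always outputs LR = 1 scores exactly 1.
    """
    # Same-speaker comparisons are penalised for small likelihood ratios.
    ss_penalty = sum(math.log2(1 + 1 / lr) for lr in same_speaker_lrs)
    ss_penalty /= len(same_speaker_lrs)
    # Different-speaker comparisons are penalised for large likelihood ratios.
    ds_penalty = sum(math.log2(1 + lr) for lr in different_speaker_lrs)
    ds_penalty /= len(different_speaker_lrs)
    return 0.5 * (ss_penalty + ds_penalty)


# Hypothetical likelihood ratios, for illustration only.
uninformative = cllr([1.0, 1.0], [1.0, 1.0])   # every LR equals 1
supportive = cllr([100.0, 50.0], [0.01, 0.02])  # LRs point the right way
misleading = cllr([0.01], [100.0])              # LRs point the wrong way
```

Under this formula a system whose likelihood ratios consistently support the correct hypothesis scores below 1, while one that gives strongly misleading likelihood ratios scores above 1, which is why a lower C_llr indicates better accuracy.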