Likelihood ratio calculation for a disputed-utterance analysis with limited available data

Authors:
Geoffrey Stewart Morrison; Jonas Lindh; James M Curran
Affiliations:
Forensic Voice Comparison Laboratory, School of Electrical Engineering & Telecommunications, University of New South Wales, UNSW Sydney, NSW 2052, Australia; Division of Speech and Language Pathology, Department of Clinical Neuroscience and Rehabilitation, Institute of Neuroscience and Physiology, Sahlgrenska Academy, University of Gothenburg, Box 452 ...; Department of Statistics, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
Venue:
Speech Communication
Year:
2014

Abstract

We present a disputed-utterance analysis using relevant data, quantitative measurements and statistical models to calculate likelihood ratios. The acoustic data were taken from an actual forensic case in which the amount of data available to train the statistical models was small and the data point from the disputed word was far out on the tail of one of the modelled distributions. A procedure based on single multivariate Gaussian models for each hypothesis led to an unrealistically high likelihood ratio value with extremely poor reliability, but a procedure based on Hotelling's T^2 statistic and a procedure based on calculating a posterior predictive density produced more acceptable results. The Hotelling's T^2 procedure attempts to take account of the sampling uncertainty of the mean vectors and covariance matrices due to the small number of tokens used to train the models, and the posterior-predictive-density analysis integrates out the values of the mean vectors and covariance matrices as nuisance parameters. Data scarcity is common in forensic speech science and we argue that it is important not to accept extremely large calculated likelihood ratios at face value, but to consider whether such values can be supported given the size of the available data and modelling constraints.
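
The contrast between the single-Gaussian procedure and the posterior-predictive procedure can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy two-dimensional tokens are invented, the function names are hypothetical, and the predictive density assumes a noninformative prior p(mu, Sigma) proportional to |Sigma|^(-(d+1)/2), for which the posterior predictive is a multivariate Student-t with n - d degrees of freedom. The abstract does not state which priors the paper used, and the Hotelling's T^2 procedure is not sketched here.

```python
import math
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate Gaussian at x."""
    d = mean.size
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * math.log(2 * math.pi) + logdet + maha)


def student_t_logpdf(x, loc, scale, dof):
    """Log density of a multivariate Student-t at x."""
    d = loc.size
    diff = x - loc
    _, logdet = np.linalg.slogdet(scale)
    maha = diff @ np.linalg.solve(scale, diff)
    return (math.lgamma((dof + d) / 2) - math.lgamma(dof / 2)
            - 0.5 * d * math.log(dof * math.pi) - 0.5 * logdet
            - 0.5 * (dof + d) * math.log1p(maha / dof))


def plugin_log_density(x, tokens):
    """Plug-in approach: treat the sample mean and covariance as exact."""
    mean = tokens.mean(axis=0)
    cov = np.cov(tokens, rowvar=False)
    return gaussian_logpdf(x, mean, cov)


def predictive_log_density(x, tokens):
    """Posterior predictive under the assumed noninformative prior:
    mean and covariance are integrated out, giving a Student-t."""
    n, d = tokens.shape
    mean = tokens.mean(axis=0)
    scatter = (n - 1) * np.cov(tokens, rowvar=False)  # sum-of-squares matrix
    dof = n - d
    scale = scatter * (n + 1) / (n * dof)
    return student_t_logpdf(x, mean, scale, dof)


# Invented toy data: eight tokens per hypothesis, two features each.
tokens_a = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1],
                     [2, 0], [-2, 0], [0, 2], [0, -2]], dtype=float)
tokens_b = tokens_a + 4.0

# Disputed token, far out on the tail of the model for hypothesis A.
x = np.array([7.0, 7.0])

ln10 = math.log(10)
log10_lr_plugin = (plugin_log_density(x, tokens_b)
                   - plugin_log_density(x, tokens_a)) / ln10
log10_lr_predictive = (predictive_log_density(x, tokens_b)
                       - predictive_log_density(x, tokens_a)) / ln10

print(f"plug-in Gaussian log10 LR:     {log10_lr_plugin:.2f}")
print(f"posterior predictive log10 LR: {log10_lr_predictive:.2f}")
```

On this toy data the plug-in Gaussian log10 likelihood ratio is about 10, whereas the predictive one is about 2. The Gaussian log density falls off quadratically with distance from the mean, while the Student-t log density falls off only logarithmically, so a token far out on the tail of a model trained on few tokens no longer produces an extreme value.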