Automatic sex identification from short segments of speech

Authors:
J. W. Fussell
Affiliations:
Dept. of Defense, Fort Meade, MD, USA
Venue:
ICASSP '91: Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing
Year:
1991

Cited by:

Support Vector Learning for Gender Classification Using Audio and Visual Cues: A Comparison

SVM '02 Proceedings of the First International Workshop on Pattern Recognition with Support Vector Machines

Abstract

The performance of a sex identification system working on 16 ms segments of speech is discussed. Consideration is given to: the effectiveness of individual cepstral coefficients and frame-to-frame differences of those coefficients for sex identification; performance of the Gaussian classifier as a function of the amount of training data; the results of simplifications to the Gaussian classifier; performance when training and testing are done by phoneme, by phoneme-class, and on all speech; effects of training on one phoneme and testing on speech from different phonemes; and distribution of sex identification errors by speaker. It is concluded that, if the speech signal is of high quality, it should not be difficult to build a practical system which uses successive outputs from the single-frame Gaussian classifier to accurately classify a speaker as being male or female. Additional testing needs to be done on data which are not of such high quality.
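
The single-frame Gaussian classifier and the use of successive frame outputs for a speaker-level decision can be illustrated with a minimal sketch. The abstract does not give the paper's feature dimensionality, covariance structure, or combination rule, so everything specific below is an assumption: 12 features per frame, full covariance matrices, summed per-frame log-likelihood ratios (which treats frames as independent), synthetic data standing in for real cepstral coefficients, and the function names themselves.

```python
import numpy as np


def fit_gaussian(frames):
    """Estimate a mean vector and full covariance from an (n_frames, n_features) array."""
    mean = frames.mean(axis=0)
    # Small diagonal term keeps the covariance invertible (an assumption, not from the paper).
    cov = np.cov(frames, rowvar=False) + 1e-6 * np.eye(frames.shape[1])
    return mean, cov


def log_likelihood(frames, mean, cov):
    """Return the Gaussian log-density of each row of `frames`."""
    d = frames.shape[1]
    diff = frames - mean
    _, logdet = np.linalg.slogdet(cov)
    # Squared Mahalanobis distance of every frame to the class mean.
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)


def classify_speaker(frames, male_model, female_model):
    """Label each frame, then label the speaker from the summed log-likelihood ratio."""
    llr = log_likelihood(frames, *male_model) - log_likelihood(frames, *female_model)
    frame_labels = np.where(llr > 0, "male", "female")
    # Summing per-frame ratios assumes independent frames (hypothetical combination rule).
    speaker_label = "male" if llr.sum() > 0 else "female"
    return frame_labels, speaker_label


# Synthetic stand-in for cepstral feature vectors; real data would come from
# short speech segments such as the 16 ms frames the abstract mentions.
rng = np.random.default_rng(0)
n_features = 12  # assumed dimensionality
male_train = rng.normal(loc=0.5, scale=1.0, size=(2000, n_features))
female_train = rng.normal(loc=-0.5, scale=1.0, size=(2000, n_features))

male_model = fit_gaussian(male_train)
female_model = fit_gaussian(female_train)

# Fifty successive frames from one hypothetical "male" speaker.
test_frames = rng.normal(loc=0.5, scale=1.0, size=(50, n_features))
frame_labels, speaker_label = classify_speaker(test_frames, male_model, female_model)
print(speaker_label, (frame_labels == "male").mean())
```

Individual frames can be misclassified while the accumulated score still settles on the correct class, which is the intuition behind combining successive single-frame outputs. Restricting `cov` to its diagonal would be one example of the kind of classifier simplification the abstract mentions, though the paper's actual simplifications are not stated here.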