Static representation of speech dynamics for isolated word recognition

Authors:
Chorkin Chan;Jian-Xiong Wu
Affiliations:
Department of Computer Science, University of Hong Kong, Hong Kong;Department of Computer Science, University of Hong Kong, Hong Kong
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Abstract

A Static Model (SM) in the form of a single vector is proposed to represent the temporal properties of a sequence of speech feature vectors. In contrast to a Hidden Markov Model, which captures the conditional probabilities of state transitions between consecutive observations X_t and X_{t+1} over time, SM captures their average joint probabilities of belonging to a pair of phonetic classes ω_i and ω_j, without any Markovian assumption. SM is tested with isolated words derived from the TIMIT database as well as with artificially created words. The TIMIT vocabulary consists of twenty-one words derived from the two "sa" sentences, spoken by 420 speakers. The artificial vocabulary of 10 words is designed to study the limitations of SM. Experimental results indicate that, apart from a rather mild limitation in handling certain types of vocabulary, SM performs better than baseline Continuous Hidden Markov Models (CHMM) in recognition rate for isolated word recognition, and it requires only 60% of the recognition time needed by CHMM.
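
The abstract does not give the formula for the SM vector, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that each frame already comes with a vector of class posteriors (for example, from a phonetic classifier), and that the joint probability of two consecutive frames belonging to classes i and j is estimated by the product of their posteriors, averaged over all consecutive frame pairs. The function name `static_model` and the toy input are hypothetical.

```python
def static_model(posteriors):
    """Return a flattened K*K vector of averaged pairwise class probabilities.

    posteriors: list of T frames, each a list of K class posteriors.
    Assumption (not from the paper): the joint probability for a frame pair
    is approximated by the product of the two frames' posteriors.
    """
    if len(posteriors) < 2:
        raise ValueError("need at least two frames")
    k = len(posteriors[0])
    sm = [0.0] * (k * k)
    # Accumulate the outer product of each frame with its successor.
    for cur, nxt in zip(posteriors, posteriors[1:]):
        for i in range(k):
            for j in range(k):
                sm[i * k + j] += cur[i] * nxt[j]
    pairs = len(posteriors) - 1
    # Average over all consecutive frame pairs.
    return [v / pairs for v in sm]


# Toy utterance: 3 frames, 2 classes, with hard (one-hot) posteriors.
frames = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
sm = static_model(frames)
```

Under this reading, an utterance of any length maps to a vector of fixed size K*K, which is what would allow words to be compared without time alignment or a Markov assumption. When each frame's posteriors sum to one, the entries of the resulting vector also sum to one.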