IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans
This paper proposes a scheme for analysing speech data, inspired by the concept of working memory, that combines wavelet analysis with unsupervised learning models. The scheme splits a sound stream into arbitrarily sized chunks and produces feature streams by sequentially analysing each chunk with time-frequency methods. The aim is to precisely detect the times of transitions as well as the lengths of the stable acoustic units that occur between them. The procedure applies two feature extraction stages to each audio chunk and two types of unsupervised machine learning model: hierarchical clustering and Self-Organising Maps (SOMs). The first pass scans the whole chunk piece by piece for speech and silence parts; it takes the root mean square, arithmetic mean, and standard deviation of the samples in each piece, then classifies these features into speech and non-speech clusters using hierarchical clustering. The second pass searches for stable patterns and transitions at the locations inferred from the first pass, extracting coefficients with Harmonic and Daubechies wavelets. Once the analysis is complete, the transient and stable feature vectors are stored in SOMs, the chunk advances by two seconds, and a new cycle begins on the next chunk.
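The first pass described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the frame length, the single-linkage merge criterion, and the function names (`frame_features`, `two_cluster_agglomerative`) are assumptions for the sake of the example.

```python
import numpy as np

def frame_features(x, frame_len):
    # Split the chunk into fixed-length pieces (remainder dropped) and
    # compute the three per-piece features named in the abstract:
    # root mean square, arithmetic mean, and standard deviation.
    n = len(x) // frame_len
    frames = np.asarray(x[:n * frame_len], dtype=float).reshape(n, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    mean = np.mean(frames, axis=1)
    std = np.std(frames, axis=1)
    return np.stack([rms, mean, std], axis=1)

def two_cluster_agglomerative(feats):
    # Bottom-up (agglomerative) hierarchical clustering with single
    # linkage, stopped when two clusters remain: speech vs. non-speech.
    clusters = [[i] for i in range(len(feats))]
    while len(clusters) > 2:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(feats[i] - feats[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]  # merge the closest pair of clusters
        del clusters[b]
    labels = np.empty(len(feats), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels
```

On a chunk containing a quiet passage followed by a loud tone, the quiet frames end up in one cluster and the loud frames in the other; which cluster is "speech" must then be decided, e.g. by comparing the clusters' mean RMS.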
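The second-pass coefficient extraction can be illustrated with the simplest member of the Daubechies family, the Haar (Daubechies-1) wavelet; the paper's actual wavelet orders and Harmonic-wavelet stage are not reproduced here, and the function names are assumptions.

```python
import numpy as np

def haar_dwt(x):
    # One level of the Haar (Daubechies-1) discrete wavelet transform:
    # approximation = scaled pairwise sums, detail = scaled pairwise
    # differences. Large detail coefficients flag abrupt transitions.
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        x = x[:-1]  # drop a trailing odd sample
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def wavedec(x, levels):
    # Multi-level decomposition: recursively transform the approximation
    # band, collecting one detail band per level.
    coeffs = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        coeffs.append(d)
    coeffs.append(a)  # coarsest approximation last
    return coeffs
```

Because the transform is orthonormal, the total energy of the coefficient bands equals the energy of the input, so stable units show up as energy concentrated in the approximation band while transitions concentrate energy in the detail bands.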
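Finally, storing feature vectors in a Self-Organising Map can be sketched as below. The grid size, learning-rate and neighbourhood schedules, and the function name `train_som` are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=40, lr0=0.5, sigma0=2.0, seed=0):
    # Minimal 2-D Self-Organising Map: each input pulls every unit's
    # weight toward it, scaled by a Gaussian neighbourhood centred on
    # the best-matching unit (BMU); learning rate and neighbourhood
    # width both shrink linearly over the epochs.
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.standard_normal((h, w, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w),
                                  indexing="ij"), axis=-1)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)
        sigma = sigma0 * (1 - epoch / epochs) + 0.5
        for v in data:
            dists = np.linalg.norm(weights - v, axis=2)
            bmu = np.unravel_index(np.argmin(dists), (h, w))
            grid_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
            neigh = np.exp(-grid_d2 / (2 * sigma ** 2))
            weights += lr * neigh[..., None] * (v - weights)
    return weights
```

After training, each unit's weight vector acts as a stored prototype, so the transient and stable feature vectors of successive chunks accumulate into a topologically ordered codebook that later chunks can be matched against.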