The effectiveness of glottal source analysis is known to depend on the phonetic properties of the concomitant supraglottal features. Phonetic classes such as nasals and fricatives are particularly problematic: their acoustic characteristics, including zeros in the vocal tract spectrum and aperiodic noise, can degrade glottal inverse filtering, a necessary prerequisite to glottal source analysis. In this paper, we first describe and evaluate a set of binary feature extractors for phonetic classes relevant to glottal source analysis. As voice quality classification is typically performed on feature data derived by glottal source analysis, we then investigate how removing data from certain detected phonetic regions affects classification accuracy. For the phonetic feature extraction, we compare classification algorithms based on Artificial Neural Networks (ANNs), Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs). Experiments demonstrate that the discriminative classifiers (ANNs and SVMs) generally give better results than the generative learning algorithm (GMMs). Accuracy generally decreases with the sparseness of the feature (e.g., it is lower for nasals than for syllabic regions). We find the best voice quality classification when using only glottal source parameter data derived within detected syllabic regions.
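The generative-vs-discriminative contrast at the heart of the comparison can be illustrated with a minimal sketch. The toy below pits a one-component diagonal-covariance Gaussian per class (the simplest GMM) against a logistic-regression classifier on synthetic two-dimensional data; it is illustrative only, as the paper's actual features are glottal/spectral parameters and its models are full GMMs, ANNs and SVMs. All names and data here are hypothetical.

```python
# Hedged sketch: generative classifier (one diagonal Gaussian per class,
# i.e. a 1-component GMM) vs. a discriminative logistic-regression
# classifier on synthetic binary "phonetic feature" data.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D data for two classes (e.g. nasal vs. non-nasal frames).
n = 400
X0 = rng.normal(loc=[-1.0, 0.0], scale=0.8, size=(n, 2))
X1 = rng.normal(loc=[1.0, 0.5], scale=0.8, size=(n, 2))
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

def fit_gaussian(Xc):
    """Fit a diagonal Gaussian: per-dimension mean and variance."""
    return Xc.mean(axis=0), Xc.var(axis=0) + 1e-6

def log_gauss(X, mu, var):
    """Log density of a diagonal Gaussian at each row of X."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (X - mu) ** 2 / var,
                         axis=1)

# Generative rule: pick the class with the larger conditional likelihood.
mu0, var0 = fit_gaussian(X0)
mu1, var1 = fit_gaussian(X1)
gen_pred = (log_gauss(X, mu1, var1) > log_gauss(X, mu0, var0)).astype(int)

# Discriminative rule: logistic regression via plain gradient descent.
Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-Xb @ w))       # sigmoid probabilities
    w -= 0.1 * Xb.T @ (p - y) / len(y)      # gradient step
disc_pred = (Xb @ w > 0).astype(int)

gen_acc = (gen_pred == y).mean()
disc_acc = (disc_pred == y).mean()
print(f"generative acc: {gen_acc:.3f}, discriminative acc: {disc_acc:.3f}")
```

On well-separated synthetic data like this, both rules score highly; the paper's finding is that on real, sparse phonetic features the discriminative models tend to hold up better.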