Accurate estimation of glottal closure instants (GCIs) and glottal opening instants (GOIs) is important for speech processing applications that benefit from glottal-synchronous processing. Most existing approaches detect GCIs by comparing the differentiated electroglottograph (EGG) signal to a threshold, and they give accurate results during voiced speech. More recent algorithms apply a similar approach across multiple dyadic scales using the stationary wavelet transform. All existing approaches are, however, prone to errors in the transition regions at the ends of voiced segments of speech. This paper describes a new method for EGG-based glottal activity detection that exhibits high accuracy over the entirety of voiced segments, including, in particular, the transition regions, and thereby gives a significant improvement over existing methods. Following a preprocessor based on the stationary wavelet transform, excitation due to glottal closure is detected using a group delay function, and true and false detections are then discriminated by Gaussian mixture modeling. GOI detection involves additional processing using the estimated GCIs. The main purpose of our algorithm is to provide a ground truth for GCIs and GOIs. This is essential for evaluating algorithms that estimate GCIs and GOIs from the speech signal alone, and it is also of high value in the analysis of pathological speech, where knowledge of GCIs and GOIs is often needed. We compare our algorithm with two previous algorithms against a hand-labeled database. The evaluation shows an average GCI hit rate of 99.47% and an average GOI hit rate of 99.35%, compared with 96.08% and 92.54%, respectively, for the best-performing existing algorithm.
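The threshold-based baseline mentioned above can be illustrated with a minimal sketch. It is not the method proposed in this paper, and it makes several assumptions that do not come from the source: the EGG polarity is taken such that glottal closure appears as a sharp rise (a positive peak in the differentiated signal), the test signal is a synthetic sawtooth-like waveform, and the names `synthetic_egg`, `detect_gci_threshold`, `rel_threshold` and `min_period_s`, together with their default values, are hypothetical.

```python
def synthetic_egg(fs=16000, f0=100, n_cycles=5, rise=4):
    """Toy EGG-like signal (assumption): fast rise at closure, slow decay."""
    period = fs // f0
    egg = []
    for n in range(period * n_cycles):
        phase = n % period
        if phase < rise:
            # Sharp rise over a few samples, standing in for glottal closure.
            egg.append(phase / rise)
        else:
            # Slow linear decay over the rest of the cycle.
            egg.append(1.0 - (phase - rise) / (period - rise))
    return egg


def detect_gci_threshold(egg, fs, rel_threshold=0.5, min_period_s=0.002):
    """Return sample indices where the differentiated EGG peaks above a threshold."""
    # First-order difference approximates the derivative of the EGG signal.
    degg = [b - a for a, b in zip(egg, egg[1:])]
    # Threshold relative to the largest positive slope (hypothetical choice).
    threshold = rel_threshold * max(degg)
    # Skip ahead after each detection so one closure yields one candidate.
    min_gap = max(1, round(min_period_s * fs))
    gcis = []
    n = 1
    while n < len(degg) - 1:
        is_peak = degg[n] >= degg[n - 1] and degg[n] > degg[n + 1]
        if is_peak and degg[n] > threshold:
            gcis.append(n)
            n += min_gap
        else:
            n += 1
    return gcis


fs = 16000
egg = synthetic_egg(fs=fs)
gcis = detect_gci_threshold(egg, fs)
print(gcis)
```

On this clean synthetic signal the sketch returns one candidate per cycle, spaced one pitch period apart. A single global threshold of this kind is the sort of detector that tends to fail where the EGG amplitude decays at the ends of voiced segments, which is the weakness the abstract attributes to existing approaches.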