A speech separation system is described in which sources are represented in a joint interaural time difference-fundamental frequency (ITD-F0) cue space. Traditionally, recurrent timing neural networks (RTNNs) have been used only to extract periodicity information; in this study, this type of network is extended in two ways. Firstly, a coincidence detector layer is introduced, each node of which is tuned to a particular ITD; secondly, the RTNN is made two-dimensional so that periodicity analysis can be performed at each best-ITD. Thus, one axis of the RTNN represents F0 and the other ITD, allowing sources to be segregated on the basis of their separation in ITD-F0 space. Source segregation is performed within individual frequency channels, without recourse to the across-channel estimates of F0 or ITD that are commonly used in auditory scene analysis approaches. The system is evaluated on spatialised speech signals using energy-based metrics and automatic speech recognition.
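The joint cue space described above can be illustrated with a minimal sketch. The code below is not the paper's recurrent timing network dynamics; it reduces the idea to its two axes: a coincidence detector tuned to each candidate ITD, followed by autocorrelation-style periodicity analysis of each coincidence signal. All parameters (sample rate, pulse-train sources, the specific periods and ITDs) are assumptions for the demo, chosen so that each source forms a distinct peak in the ITD-F0 grid.

```python
import numpy as np

fs, n = 16000, 16000          # assumed sample rate and signal length
max_itd, max_period = 8, 90   # search ranges, in samples

def pulse_train(period, n):
    """Idealised periodic source: one impulse per pitch period."""
    x = np.zeros(n)
    x[::period] = 1.0
    return x

# Two "spatialised" sources: periods 80 and 50 samples (F0 = 200 Hz
# and 320 Hz at 16 kHz), with ITDs of 2 and 6 samples respectively.
pa, pb = pulse_train(80, n), pulse_train(50, n)
xl = pa + pb                          # left-ear mixture
xr = np.roll(pa, 2) + np.roll(pb, 6)  # right ear: each source delayed by its ITD

# 2-D grid: one axis is best-ITD (coincidence layer), the other is
# best-period (periodicity analysis at that best-ITD).
grid = np.zeros((max_itd + 1, max_period + 1))
for d in range(max_itd + 1):
    # Coincidence detector tuned to ITD d: responds where the left
    # ear matches the right ear advanced by d samples.
    c = xl[: n - max_itd] * xr[d : d + n - max_itd]
    c = c - c.mean()
    for p in range(1, max_period + 1):
        # Periodicity of the coincidence signal at candidate period p.
        grid[d, p] = np.dot(c[p:], c[:-p])

# Each source peaks at its own (ITD, period) coordinate, so the
# dominant period at each source's ITD recovers that source's F0.
print(grid[2].argmax(), grid[6].argmax())  # -> 80 50
```

Because segregation here happens inside the joint grid, no separate across-channel estimate of F0 or ITD is needed: a source is identified simply by its peak's coordinates in ITD-F0 space.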