We present an overview of the data collection and transcription efforts for the COnversational Speech In Noisy Environments (COSINE) corpus. The corpus is a set of multi-party conversations recorded in real-world environments with background noise. It can be used to train noise-robust speech recognition systems or to develop speech de-noising algorithms. We explain the motivation for creating such a corpus and describe the resulting audio recordings and transcriptions that comprise it. These high-quality recordings were captured in situ on a custom wearable recording system, whose design and construction are also described. Seven synchronized audio channels are captured per recording: a 4-channel far-field microphone array, plus a close-talking microphone, a monophonic far-field microphone, and a throat microphone, each on its own channel. This corpus thus creates many possibilities for speech algorithm research.
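As a rough sketch of how the seven synchronized channels described above might be separated for downstream processing, the snippet below splits a multi-channel recording into its sensor groups. The channel ordering used here (array channels first, then close-talking, monophonic far-field, and throat microphone) is an illustrative assumption, not the corpus's documented layout:

```python
import numpy as np

# Hypothetical channel layout for a seven-channel COSINE recording.
# NOTE: this ordering is an assumption for illustration; consult the
# corpus documentation for the actual channel assignments.
ARRAY_CHANNELS = slice(0, 4)   # 4-channel far-field microphone array
CLOSE_TALK_CHANNEL = 4         # close-talking microphone
FAR_FIELD_CHANNEL = 5          # monophonic far-field microphone
THROAT_CHANNEL = 6             # throat microphone

def split_channels(audio: np.ndarray) -> dict:
    """Split a (num_samples, 7) recording into its sensor groups."""
    assert audio.ndim == 2 and audio.shape[1] == 7, "expected 7 channels"
    return {
        "array": audio[:, ARRAY_CHANNELS],       # shape (num_samples, 4)
        "close_talk": audio[:, CLOSE_TALK_CHANNEL],
        "far_field": audio[:, FAR_FIELD_CHANNEL],
        "throat": audio[:, THROAT_CHANNEL],
    }

# Example with one second of synthetic audio at 16 kHz.
fake_recording = np.zeros((16000, 7), dtype=np.float32)
parts = split_channels(fake_recording)
print(parts["array"].shape)       # (16000, 4)
print(parts["close_talk"].shape)  # (16000,)
```

Keeping the close-talking channel alongside the noisy far-field channels is what makes parallel clean/noisy training data possible for de-noising experiments.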