A lightweight speech detection system for perceptive environments

Authors:
Dominique Vaufreydaz;Rémi Emonet;Patrick Reignier
Affiliations:
PRIMA – INRIA Rhône-Alpes, Zirst, Montbonnot, France;PRIMA – INRIA Rhône-Alpes, Zirst, Montbonnot, France;PRIMA – INRIA Rhône-Alpes, Zirst, Montbonnot, France
Venue:
MLMI'06 Proceedings of the Third international conference on Machine Learning for Multimodal Interaction
Year:
2006

Abstract

In this paper, we address the problem of speech activity detection in multimodal perceptive environments. Such spaces may contain many different microphones (lapel, distant, or tabletop). We therefore need a generic speech activity detector that copes with varied speech conditions, from close-talking to noisy distant speech. Moreover, because the number of microphones in a room can be high, the system must also be very lightweight. The speech activity detector presented in this article runs efficiently on dozens of microphones in parallel. We will see that, even though its absolute evaluation scores are not perfect (error rates of 30% and 40%, respectively, on the two tasks), its accuracy is good enough for the context in which we use it.