Fundamentals of speech recognition
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 2
Multimodal summarization of meeting recordings
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 3
Highlight sound effects detection in audio stream
ICME '03 Proceedings of the 2003 International Conference on Multimedia and Expo - Volume 3
Discriminative Feature Selection for Applause Sounds Detection
WIAMIS '07 Proceedings of the Eighth International Workshop on Image Analysis for Multimedia Interactive Services
A Novel Detection Method of Filled Pause in Mandarin Spontaneous Speech
ICIS '08 Proceedings of the Seventh IEEE/ACIS International Conference on Computer and Information Science (icis 2008)
Stream-based classification and segmentation of speech events in meeting recordings
MRCS'06 Proceedings of the 2006 international conference on Multimedia Content Representation, Classification and Security
A flexible framework for key audio effects detection and auditory context inference
IEEE Transactions on Audio, Speech, and Language Processing
Detecting laughter in spontaneous speech by constructing laughter bouts
International Journal of Speech Technology
Applause frequently occurs in multi-participant meeting speech, and detecting it is important for meeting speech recognition, semantic inference, highlight extraction, and related tasks. In this paper, we first study the characteristic differences between applause and speech, such as duration, pitch, spectrogram, and occurrence locations. We then propose an effective algorithm based on these characteristics for detecting applause in a meeting speech stream. The algorithm first extracts non-silence segments using voice activity detection; applause segments are then detected from the non-silence segments based on the characteristic differences between applause and speech, without using any complex statistical models such as hidden Markov models. The proposed algorithm accurately determines the boundaries of applause in a meeting speech stream and is computationally efficient; it can also extract applause sub-segments from mixed segments. Experimental evaluations show that the algorithm achieves satisfactory results in detecting applause in meeting speech: precision, recall, and F1-measure are 94.34%, 98.04%, and 96.15%, respectively. Compared with the traditional algorithm under the same experimental conditions, the proposed algorithm improves F1-measure by 3.62% and saves about 35.78% of the computational time.
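The two-stage structure described in the abstract (voice activity detection first, then rule-based applause/speech discrimination with no statistical model) can be sketched roughly as follows. This is a minimal illustrative sketch, not the paper's actual method: the short-time energy VAD, the zero-crossing-rate feature standing in for the paper's duration/pitch/spectrogram cues, the threshold values, and all function names are assumptions made for the example. The intuition it encodes is only that applause is broadband and noise-like (high ZCR) and tends to last longer than a brief noise burst.

```python
import math
import random

def frames_of(x, flen, hop):
    """Split a sample list into overlapping frames of length flen with step hop."""
    return [x[i:i + flen] for i in range(0, len(x) - flen + 1, hop)]

def energy(frame):
    """Short-time energy (mean squared amplitude)."""
    return sum(s * s for s in frame) / len(frame)

def zcr(frame):
    """Zero-crossing rate: fraction of adjacent sample pairs that change sign."""
    return sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (len(frame) - 1)

def label_frames(x, sr=8000, frame_ms=25, hop_ms=10,
                 energy_thr=1e-4, zcr_thr=0.25):
    """Stage 1: energy-based VAD marks silence frames.
    Stage 2: noise-like frames (high ZCR) -> 'applause', the rest -> 'speech'.
    All thresholds are illustrative, not tuned values from the paper."""
    flen, hop = sr * frame_ms // 1000, sr * hop_ms // 1000
    labels = []
    for f in frames_of(x, flen, hop):
        if energy(f) < energy_thr:
            labels.append('silence')
        elif zcr(f) > zcr_thr:
            labels.append('applause')
        else:
            labels.append('speech')
    return labels

def to_segments(labels, hop_s=0.010, min_applause_s=1.0):
    """Merge consecutive identical labels into (label, start_s, end_s) segments;
    applause runs shorter than min_applause_s are relabeled as speech,
    mimicking a duration constraint."""
    segs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            lab = labels[start]
            if lab == 'applause' and (i - start) * hop_s < min_applause_s:
                lab = 'speech'
            segs.append((lab, round(start * hop_s, 3), round(i * hop_s, 3)))
            start = i
    return segs

# Synthetic demo: 1 s of a 200 Hz tone (voiced-speech-like, low ZCR)
# followed by 2 s of uniform noise (applause-like, high ZCR).
random.seed(0)
sr = 8000
speech_like = [0.5 * math.sin(2 * math.pi * 200 * t / sr) for t in range(sr)]
applause_like = [random.uniform(-0.5, 0.5) for _ in range(2 * sr)]
labels = label_frames(speech_like + applause_like, sr=sr)
segs = to_segments(labels)
```

On the synthetic signal above, the first frames are labeled speech and the noise portion is labeled applause, with the segment boundary falling near the 1-second mark. A real implementation would of course replace the ZCR rule with the paper's richer characteristics and tune the thresholds on data.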