Characteristics-based effective applause detection for meeting speech

Authors:
Yan-Xiong Li;Qian-Hua He;Sam Kwong;Tao Li;Ji-Chen Yang
Affiliations:
School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou 510640, Guangdong Province, China;School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou 510640, Guangdong Province, China;Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave., Kowloon, Hong Kong, China;School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou 510640, Guangdong Province, China;School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou 510640, Guangdong Province, China
Venue:
Signal Processing
Year:
2009

Abstract

Applause occurs frequently in multi-participant meeting speech, and detecting it is important for meeting speech recognition, semantic inference, highlight extraction, and related tasks. In this paper, we first study the characteristics that distinguish applause from speech, such as duration, pitch, spectrogram, and occurrence location. We then propose an effective algorithm based on these characteristics for detecting applause in a meeting speech stream. The algorithm first extracts the non-silence signal segments using voice activity detection. It then detects applause segments among the non-silence segments based on the characteristic differences between applause and speech, without using complex statistical models such as hidden Markov models. The proposed algorithm accurately determines the boundaries of applause in a meeting speech stream and is computationally efficient. In addition, it can extract applause sub-segments from mixed segments. Experimental evaluations show that the proposed algorithm achieves satisfactory results in detecting applause in meeting speech: the precision rate, recall rate, and F1-measure are 94.34%, 98.04%, and 96.15%, respectively. Compared with the traditional algorithm under the same experimental conditions, it improves the F1-measure by 3.62% and saves about 35.78% of the computational time.
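
As a quick consistency check on the reported figures, the F1-measure can be recomputed from the stated precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the abstract does not spell out the formula, and the function name `f1_measure` is illustrative rather than taken from the paper.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1].

    Assumption: this is the standard F1 definition; the abstract does not
    state which formula the authors used.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both rates are zero
    return 2 * precision * recall / (precision + recall)


precision = 0.9434  # precision rate reported in the abstract
recall = 0.9804     # recall rate reported in the abstract

f1 = f1_measure(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 96.15%"
```

Under that assumption, the recomputed value agrees with the 96.15% F1-measure given in the abstract.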