The use of emphasis to automatically summarize a spoken discourse

Authors:
Francine R. Chen;Margaret Withgott
Affiliations:
Xerox Palo Alto Research Center, Palo Alto, California;Xerox Palo Alto Research Center, Palo Alto, California
Venue:
ICASSP'92 Proceedings of the 1992 IEEE international conference on Acoustics, speech and signal processing - Volume 1
Year:
1992

Citing 0
Cited 4

Speaker segmentation for browsing recorded audio
CHI '95 Conference Companion on Human Factors in Computing Systems

Devising Interactive Access Techniques for Indian Language Document Images
ICDAR '03 Proceedings of the Seventh International Conference on Document Analysis and Recognition - Volume 2

From English pitch accent detection to Mandarin stress detection, where is the difference?
Computer Speech and Language

Spoken Content Retrieval: A Survey of Techniques and Technologies
Foundations and Trends in Information Retrieval

Abstract

This paper describes a method for exploiting prosodic information in natural, conversational speech for the purpose of automatically creating an audio summary. The method is based on identifying emphasized speech and then using proximity measures on the emphasized regions to select summarizing excerpts. Emphasized speech is recognized using a hidden Markov model employing only non-spectral, prosodic information. Syllable-based models were created and the models trained on spontaneous speech in which words had been labeled by a panel of listeners for degree of emphasis. Emphatic speech from one speaker was automatically detected and summarizing excerpts were identified, with no noticeable difference when compared to excerpts selected by individual subjects. The extensibility of the emphasis detector to other speakers was tested on a small sample of telephone speech by 10 other speakers.
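
The excerpt-selection step mentioned in the abstract (using proximity measures on emphasized regions) can be illustrated with a minimal sketch. The abstract does not specify the actual proximity measure, so everything below is an assumption made for illustration: emphasized regions are taken as already-detected (start, end) times in seconds, regions separated by at most a hypothetical `max_gap` are grouped, and the groups containing the most emphasized regions are returned as excerpts. The function names, the gap threshold, and the ranking rule are not from the paper.

```python
from typing import List, Tuple

Region = Tuple[float, float]  # (start_sec, end_sec) of one emphasized stretch


def group_by_proximity(regions: List[Region], max_gap: float) -> List[List[Region]]:
    """Group emphasized regions whose gap to the current group is <= max_gap.

    Assumption: a simple time-gap threshold stands in for the paper's
    unspecified proximity measure.
    """
    groups: List[List[Region]] = []
    group_end = 0.0  # latest end time seen in the current group
    for start, end in sorted(regions):
        if groups and start - group_end <= max_gap:
            groups[-1].append((start, end))
            group_end = max(group_end, end)
        else:
            groups.append([(start, end)])
            group_end = end
    return groups


def select_excerpts(regions: List[Region], max_gap: float = 2.0,
                    top_k: int = 2) -> List[Region]:
    """Return up to top_k excerpts, each spanning one dense group of emphasis.

    Assumption: groups are ranked by how many emphasized regions they hold,
    with earlier groups winning ties. Excerpts are returned in time order.
    """
    groups = group_by_proximity(regions, max_gap)
    ranked = sorted(groups, key=lambda g: (-len(g), g[0][0]))
    excerpts = [(g[0][0], max(end for _, end in g)) for g in ranked[:top_k]]
    return sorted(excerpts)


if __name__ == "__main__":
    # Hypothetical detector output, in seconds.
    emphasized = [(1.0, 1.5), (2.0, 2.4), (3.0, 3.2),
                  (10.0, 10.5), (20.0, 20.3), (21.0, 21.4)]
    print(select_excerpts(emphasized))
```

In this sketch an isolated emphasized word (the region at 10.0 s) is outranked by stretches where emphasis clusters, which is one plausible reading of why proximity, rather than emphasis alone, would drive excerpt choice.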