Computer Speech and Language
We propose an extractive approach to speech summarization built on a novel framework for learning shallow rhetorical structure. One of the most under-utilized features in extractive summarization is hierarchical structure information: the semantically cohesive units that are hidden in spoken documents. We first present empirical evidence that rhetorical structure is the underlying semantic information, which is rendered in linguistic and acoustic/prosodic forms in lecture speech. To test this hypothesis, we first propose a segmental summarization method in which the document is partitioned into rhetorical units by K-means clustering. This system produces summaries at 67.36% ROUGE-L F-measure, a 4.29% absolute increase over the baseline system. We then propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode the underlying hierarchical rhetorical structure in speech. Tenfold cross-validation experiments are carried out on conference speeches. The system based on RSHMMs gives a 71.31% ROUGE-L F-measure, an 8.24% absolute increase in lecture speech summarization performance over the baseline system without RSHMMs. Our method also outperforms the baseline with a conventional discourse feature. Finally, we present a thorough investigation of the relative contribution of different features and show that, for lecture speech, speaker-normalized acoustic features contribute the most, at 68.5% ROUGE-L F-measure, compared with 62.9% for linguistic features and 59.2% for un-normalized acoustic features. This shows that each speaker's individual speaking style is highly relevant to summarization.
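All results above are reported as ROUGE-L F-measure, which scores a candidate summary by its longest common subsequence (LCS) with a reference summary. The following is a minimal sketch of that metric for a single candidate/reference pair, not the evaluation setup used in the paper: the function names, the whitespace tokenization, and the balanced weighting `beta=1.0` are assumptions made for illustration, and the summary-level variant that combines LCS matches across several reference sentences is not shown.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    # Dynamic programming, keeping only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure for one candidate/reference pair.

    Assumptions (not taken from the paper): lowercased whitespace
    tokenization, and beta=1.0 so precision and recall weigh equally.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    b2 = beta * beta
    return (1 + b2) * precision * recall / (recall + b2 * precision)


if __name__ == "__main__":
    score = rouge_l_f("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-L F: {score:.4f}")
```

Because the LCS respects word order without requiring adjacency, the score rewards a summary that preserves the reference's sequence of content even when words are inserted or dropped in between.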