A rhetorical syntax-driven model for speech summarization

Authors:
Jian Zhang;Pascale Fung
Affiliations:
Hong Kong University of Science & Technology (HKUST);Hong Kong University of Science & Technology (HKUST)
Venue:
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Year:
2010

Citing 11
Cited 0

Auto-summarization of audio-video presentations

MULTIMEDIA '99 Proceedings of the seventh ACM international conference on Multimedia (Part 1)
Comparing presentation summaries: slides vs. reading vs. listening

Proceedings of the SIGCHI conference on Human Factors in Computing Systems
Information Retrieval

Information Retrieval
Summarizing scientific articles: experiments with relevance and rhetorical status

Computational Linguistics - Summarization
TextTiling: segmenting text into multi-paragraph subtopic passages

Computational Linguistics
A prosodic analysis of discourse segments in direction-giving monologues

ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Combining optimal clustering and Hidden Markov models for extractive summarization

MultiSumQA '03 Proceedings of the ACL 2003 workshop on Multilingual summarization and question answering - Volume 12
One story, one flow: Hidden Markov Story Models for multilingual multidocument summarization

ACM Transactions on Speech and Language Processing (TSLP)
Prosodic correlates of rhetorical relations

ACTS '09 Proceedings of the HLT-NAACL 2006 Workshop on Analyzing Conversations in Text and Speech
Bayesian unsupervised topic segmentation

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Automatically generating Wikipedia articles: a structure-aware approach

ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1 - Volume 1

Quantified Score

Hi-index	0.00

Visualization

Abstract

We show a novel approach of parsing and reordering rhetorical syntax tree for extractive summarization of presentation speech. Our previous work showed (Fung et al., 2008) that rhetorical structures are embedded in this type of speech and that exploring this structure helps improve summarization quality. We further demonstrate that speakers do not follow the strict order of bullet points in the presentation slides, and that a re-ordering of these points occurs. We therefore propose a method of parsing presentation transcriptions into a rhetorical syntax tree and then re-order the leaf nodes to transform the speech transcriptions into an extractive summary akin to a process of presentation slide generation. Chunking, parsing, and reordering are carried out by 28-class Hidden Markov Support Vector Machine(HMSVM) classifier trained from reference presentations and presentation slides. Using ROUGE-L F-measure we showed that our rhetorical syntaxdriven model gives a 35.8% relative improvement over a binary summarizer with no rhetorical information, a 14.3% improvement over Rhetorical State Hidden Markov Model(RSHMM) (Fung et al., 2008), and a 4.3% improvement over our proposed model with no reordering.