Assessing prosodic and text features for segmentation of Mandarin broadcast news

Authors:
Gina-Anne Levow
Affiliations:
University of Chicago
Venue:
SpeechIR '04 Proceedings of the Workshop on Interdisciplinary Approaches to Speech Indexing and Retrieval at HLT-NAACL 2004
Year:
2004

Abstract

Automatic topic segmentation, the separation of a discourse stream into its constituent stories or topics, is a necessary preprocessing step for applications such as information retrieval, anaphora resolution, and summarization. While significant progress has been made in this area for text sources and for English audio sources, little work has been done in automatic segmentation of other languages using both text and acoustic information. In this paper, we focus on exploiting both textual and prosodic features for topic segmentation of Mandarin Chinese. As a tone language, Mandarin presents special challenges for the applicability of intonation-based techniques, since the pitch contour is also used to establish lexical identity. However, intonational cues such as reduction in pitch and intensity at topic boundaries and increase in duration and pause still provide significant contrasts in Mandarin Chinese. We first build a decision tree classifier that, based only on prosodic information, achieves boundary classification accuracy of 89-95.8% on a large standard test set. We then contrast these results with a simple text similarity-based classification scheme. Finally, we build a merged classifier, finding the best effectiveness for systems integrating text and prosodic cues.
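
The abstract does not say how the text similarity-based scheme is computed. As a rough illustration only, the sketch below scores each candidate boundary by the cosine similarity of the word counts in the windows on either side of it, with low similarity suggesting a topic change. The function names, the window size, and the threshold are assumptions made for this example, not details taken from the paper.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    # Two empty windows share no evidence; treat them as dissimilar.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def boundary_scores(tokens, window=50):
    """Map each candidate position i to the similarity of the `window`
    tokens before i and the `window` tokens starting at i.

    `window` is a hypothetical parameter chosen for illustration.
    """
    scores = {}
    for i in range(window, len(tokens) - window + 1):
        left = Counter(tokens[i - window:i])
        right = Counter(tokens[i:i + window])
        scores[i] = cosine(left, right)
    return scores


def is_boundary(similarity, threshold=0.1):
    """Flag a position as a topic boundary when similarity is low.

    The threshold is an assumed value, not one reported in the paper.
    """
    return similarity < threshold
```

For Mandarin input, `tokens` could be words from a segmenter or character n-grams; that choice is left open here. A merged system along the lines the abstract describes could pass such a similarity score to a classifier as one more feature next to the prosodic measurements.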