Automatic broadcast news speech summarization

  • Authors:
  • Julia Hirschberg; Sameer Raj Maskey

  • Affiliations:
  • Columbia University; Columbia University

  • Venue:
  • Automatic broadcast news speech summarization
  • Year:
  • 2008

Abstract

As the number of speech and video documents available on the web and on handheld devices soars, it becomes increasingly important to let users automatically find the relevant, significant, and interesting parts of those documents. In this dissertation, we present ConciseSpeech, a system for summarizing Broadcast News (BN) that identifies important segments of speech using lexical, acoustic/prosodic, and structural information and combines them into a summary, optimizing its significance, length, and redundancy. Several obstacles particular to speech, such as word errors, disfluencies, and the lack of segmentation, make speech summarization challenging, and we present methods to address them. We show how Automatic Speech Recognition (ASR) confidence scores can compensate for word errors; present a phrase-level machine translation approach that uses weighted finite state transducers to detect disfluencies; and explore the use of intonational phrase segments for summarization. We also describe structural properties of BN that help determine which segments should be selected for a summary, including speaker roles, soundbites, and commercials, and we present Information Extraction (IE) techniques based on statistical methods such as conditional random fields and decision trees to identify these properties automatically. ConciseSpeech was built to handle single spoken documents, but we have extended it to answer user queries by summarizing multiple documents. For the query-focused version of ConciseSpeech, we also built a knowledge resource (NE-NET) that finds related named entities to significantly improve the document retrieval task of query-focused summarization. We show how all of these techniques improve speech summarization compared with traditional text-based methods applied to speech transcripts.
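
The abstract describes selecting segments while balancing significance, length, and redundancy, and using ASR confidence to offset word errors. The sketch below illustrates one generic way such a trade-off could be expressed: a greedy, MMR-style selection under a word budget. It is not the ConciseSpeech method, which the abstract does not specify. The `Segment` fields, the Jaccard overlap measure, the multiplicative use of confidence, and the `redundancy_weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    words: list[str]
    significance: float    # hypothetical score, e.g. from a classifier over
                           # lexical, acoustic/prosodic, and structural features
    asr_confidence: float  # hypothetical mean ASR word confidence in [0, 1]


def overlap(a: Segment, b: Segment) -> float:
    """Jaccard word overlap, used here as a stand-in redundancy measure."""
    sa, sb = set(a.words), set(b.words)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def select_segments(segments, max_words, redundancy_weight=0.5):
    """Greedily pick segments until the word budget is exhausted.

    Each candidate is scored as significance * asr_confidence minus a
    penalty for its highest overlap with any already selected segment.
    """
    selected = []
    remaining = list(segments)
    used = 0
    while remaining:
        best, best_score = None, float("-inf")
        for seg in remaining:
            # Length constraint: skip segments that would exceed the budget.
            if used + len(seg.words) > max_words:
                continue
            redundancy = max((overlap(seg, s) for s in selected), default=0.0)
            score = (seg.significance * seg.asr_confidence
                     - redundancy_weight * redundancy)
            if score > best_score:
                best, best_score = seg, score
        # Stop when nothing fits or the best candidate no longer adds value.
        if best is None or best_score <= 0:
            break
        selected.append(best)
        remaining = [s for s in remaining if s is not best]
        used += len(best.words)
    return selected
```

Down-weighting significance by ASR confidence means a segment that looks important but was probably misrecognized is less likely to enter the summary. The redundancy penalty discourages selecting two segments that repeat the same content.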