Improving supervised learning for meeting summarization using sampling and regression

  • Authors:
  • Shasha Xie; Yang Liu

  • Affiliations:
  • Department of Computer Science, The University of Texas at Dallas, Richardson 75080, USA; Department of Computer Science, The University of Texas at Dallas, Richardson 75080, USA

  • Venue:
  • Computer Speech and Language
  • Year:
  • 2010

Abstract

Meeting summarization produces a concise, informative summary of a lengthy meeting and is an effective tool for efficient information access. In this paper, we focus on extractive summarization, in which salient sentences are selected from the meeting transcript to form a summary. We adopt a supervised learning approach to this task, using a classifier that decides, based on a rich set of features, whether to include each sentence in the summary. We address two important problems associated with this supervised classification approach. First, we propose different sampling methods to deal with the imbalanced data problem in this task, where summary sentences are the minority class. Second, to account for human disagreement in summary annotation, we reframe extractive summarization as a regression problem instead of binary classification. We evaluate our approaches on the ICSI meeting corpus, using both human transcripts and speech recognition output, and show performance improvements from the different sampling methods and the regression model.
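The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which sampling methods or regression targets were used. The sketch assumes plain random undersampling of the majority (non-summary) class, and assumes a soft label equal to the fraction of annotators who selected a sentence. The function names `undersample` and `regression_targets`, the `ratio` parameter, and the dictionary layout are all hypothetical.

```python
import random


def undersample(examples, ratio=1.0, seed=0):
    """Randomly drop non-summary sentences (label 0) so that at most
    ratio * (number of summary sentences) of them remain.

    Assumption: simple random undersampling, used here only as one
    generic way to rebalance classes.
    """
    rng = random.Random(seed)
    positives = [e for e in examples if e["label"] == 1]
    negatives = [e for e in examples if e["label"] == 0]
    keep = min(len(negatives), int(ratio * len(positives)))
    sampled = positives + rng.sample(negatives, keep)
    rng.shuffle(sampled)
    return sampled


def regression_targets(votes, num_annotators):
    """Turn per-sentence annotator vote counts into soft labels in [0, 1].

    Assumption: the target is the fraction of annotators who chose the
    sentence, so disagreement yields intermediate values instead of a
    forced 0/1 decision.
    """
    return [v / num_annotators for v in votes]


# Toy data: 2 summary sentences among 10 (imbalanced).
data = [{"id": i, "label": 1 if i < 2 else 0} for i in range(10)]
balanced = undersample(data, ratio=1.0)
soft_labels = regression_targets([0, 1, 3], num_annotators=3)
```

With `ratio=1.0` the resampled training set contains as many non-summary sentences as summary sentences. The soft labels could then be used as targets for any regression model in place of binary class labels.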