Active learning with semi-automatic annotation for extractive speech summarization

  • Authors: Justin Jian Zhang; Pascale Fung

  • Affiliations: Dongguan University of Technology, Dongguan, China and The Hong Kong University of Science and Technology; The Hong Kong University of Science and Technology

  • Venue: ACM Transactions on Speech and Language Processing (TSLP)
  • Year: 2012

Abstract

We propose using active learning for extractive speech summarization to reduce the human effort needed to generate reference summaries. Active learning selects a subset of samples to be labeled. We propose a combination of informativeness and representativeness criteria for this selection. We further propose a semi-automatic method for generating reference summaries for presentation speech, using Relaxed Dynamic Time Warping (RDTW) alignment between the presentation speech and its accompanying slides. Our summarization results show that the amount of labeled data needed to reach a given summarization accuracy can be reduced by more than 23% compared to random sampling.
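The abstract does not say how the informativeness and representativeness criteria are computed or combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes informativeness is the binary entropy of a classifier's predicted probability that a sample belongs in the summary, that representativeness is a sample's mean cosine similarity to the other unlabeled samples, and that the two are mixed linearly. The names `select_batch`, `weight`, `probs`, and `features` are hypothetical.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a predicted probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_batch(probs, features, k, weight=0.5):
    """Return indices of the k unlabeled samples to send for labeling.

    Assumption: score = weight * informativeness
                      + (1 - weight) * representativeness.
    """
    n = len(probs)
    scores = []
    for i in range(n):
        # Informativeness: how uncertain the current model is about sample i.
        info = entropy(probs[i])
        # Representativeness: how similar sample i is to the rest of the pool.
        sims = [cosine(features[i], features[j]) for j in range(n) if j != i]
        rep = sum(sims) / len(sims) if sims else 0.0
        scores.append(weight * info + (1 - weight) * rep)
    # Highest combined score first; ties keep their original order.
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
```

With `probs = [0.5, 0.99, 0.5]` and `features = [[1, 0], [1, 0], [0, 1]]`, sample 0 is both uncertain and similar to another sample, so `select_batch(probs, features, 1)` picks index 0. Sample 2 is equally uncertain but an outlier, so it ranks lower. Sample 1 is similar to sample 0 but the model is already confident about it.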