Syntactic surprisal affects spoken word duration in conversational contexts

  • Authors:
  • Vera Demberg; Asad B. Sayeed; Philip J. Gorinski; Nikolaos Engonopoulos

  • Affiliations:
  • Saarland University, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany; Saarland University, Saarbrücken, Germany

  • Venue:
  • EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
  • Year:
  • 2012

Abstract

We present the results of a novel experiment on speech production in conversational data that links speech rate to information density. We provide the first evidence of an association between syntactic surprisal and word duration in recorded speech. Using the AMI corpus, which contains transcriptions of focus group meetings with precise word durations, we show that word durations correlate with syntactic surprisal estimated from the incremental Roark parser, over and above simpler measures such as word frequencies and word durations estimated from a state-of-the-art text-to-speech system. We also show that the syntactic surprisal estimates are better predictors of word duration than a simpler version of surprisal based on trigram probabilities. This result supports the uniform information density (UID) hypothesis and points the way to more realistic artificial speech generation.
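
For readers unfamiliar with the trigram baseline mentioned in the abstract, the following is a minimal sketch of how trigram surprisal, -log2 P(word | two preceding words), can be computed. It is an illustration only, not the authors' implementation: the toy corpus, the function names, and the add-one smoothing are all assumptions made here, and the syntactic surprisal from the Roark parser is not reproduced.

```python
import math
from collections import Counter

def train_trigram_counts(sentences):
    """Collect trigram and bigram-context counts from tokenized sentences."""
    tri, bi, vocab = Counter(), Counter(), set()
    for sent in sentences:
        tokens = ["<s>", "<s>"] + sent + ["</s>"]
        vocab.update(tokens)
        for i in range(2, len(tokens)):
            tri[tuple(tokens[i - 2:i + 1])] += 1
            bi[tuple(tokens[i - 2:i])] += 1
    return tri, bi, vocab

def trigram_surprisal(word, context, tri, bi, vocab):
    """Surprisal in bits of `word` given the last two words of `context`.

    Add-one smoothing is an assumption of this sketch, chosen for brevity.
    """
    ctx = tuple(context[-2:])
    prob = (tri[ctx + (word,)] + 1) / (bi[ctx] + len(vocab))
    return -math.log2(prob)

# Hypothetical toy corpus; not taken from the AMI corpus.
toy_corpus = [
    ["we", "need", "a", "remote"],
    ["we", "need", "a", "design"],
    ["we", "need", "a", "remote"],
]
tri, bi, vocab = train_trigram_counts(toy_corpus)

s_frequent = trigram_surprisal("remote", ["need", "a"], tri, bi, vocab)
s_rare = trigram_surprisal("design", ["need", "a"], tri, bi, vocab)
s_unseen = trigram_surprisal("we", ["need", "a"], tri, bi, vocab)
```

In this toy setting the continuation seen most often after "need a" receives the lowest surprisal and an unseen continuation the highest. Under the UID hypothesis, higher-surprisal words would be expected to be spoken with longer durations.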