Towards using structural events to assess non-native speech

  • Authors:
  • Lei Chen;Joel Tetreault;Xiaoming Xi

  • Affiliations:
  • Educational Testing Service (ETS), Princeton, NJ;Educational Testing Service (ETS), Princeton, NJ;Educational Testing Service (ETS), Princeton, NJ

  • Venue:
  • IUNLPBEA '10 Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications
  • Year:
  • 2010

Abstract

We investigated using structural events, e.g., clause and disfluency structure, in transcriptions of spontaneous non-native speech to compute features for measuring speaking proficiency. Using a set of transcribed audio files collected from the TOEFL Practice Test Online (TPO), we conducted a detailed annotation of structural events, including clause boundaries and types as well as disfluencies. From the words and the annotated structural events, we extracted features related to syntactic complexity, e.g., the mean length of clause (MLC) and the dependent clause frequency (DEPC), and one feature related to disfluencies, the interruption point frequency per clause (IPC). Among these features, IPC shows the highest correlation with holistic scores (r = -0.344). Furthermore, we increased the correlation with human scores by normalizing IPC by (1) MLC (r = -0.386), (2) DEPC (r = -0.429), and (3) both (r = -0.462). In this research, features derived from the structural events of speech transcriptions are found to predict holistic scores of speaking proficiency. This suggests that structural events estimated on speech word strings offer a promising way to assess non-native speech.
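The abstract names the features but does not give their formulas, so the following is only a minimal sketch under stated assumptions: MLC is taken as words per clause, DEPC as dependent clauses per clause, IPC as interruption points per clause, and "normalizing IPC by MLC" as a simple division. The `Response` record, its field names, and the function names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence


@dataclass
class Response:
    """Hypothetical per-response counts taken from an annotated transcript."""
    n_words: int
    n_clauses: int
    n_dependent_clauses: int
    n_interruption_points: int


def mlc(r: Response) -> float:
    # Assumed definition: mean length of clause = words per clause.
    return r.n_words / r.n_clauses


def depc(r: Response) -> float:
    # Assumed definition: dependent clauses per clause.
    return r.n_dependent_clauses / r.n_clauses


def ipc(r: Response) -> float:
    # Interruption point frequency per clause.
    return r.n_interruption_points / r.n_clauses


def ipc_normalized_by_mlc(r: Response) -> float:
    # Assumption: "normalizing" means dividing IPC by MLC.
    return ipc(r) / mlc(r)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between a feature and holistic scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

With per-response feature values in one list and human holistic scores in another, `pearson_r(feature_values, scores)` gives the kind of correlation the abstract reports. A negative value would indicate that more interruption points per clause go with lower scores.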