An investigation of code-switching attitude dependent language modeling

  • Authors:
  • Ngoc Thang Vu, Heike Adel, Tanja Schultz

  • Affiliation:
  • Institute for Anthropomatics, Karlsruhe Institute of Technology (KIT), Germany (all authors)

  • Venue:
  • SLSP'13: Proceedings of the First International Conference on Statistical Language and Speech Processing
  • Year:
  • 2013

Abstract

In this paper, we investigate the adaptation of language models for conversational Mandarin-English Code-Switching (CS) speech and its effect on speech recognition performance. First, we investigate the prediction of code switches based on textual features, with a focus on Part-of-Speech (POS) tags. We show that the switching attitude is speaker dependent and use this finding to cluster the training speakers into classes with similar switching attitudes. Second, we apply recurrent neural network language models that integrate POS information into the input layer and factorize the output layer into languages in order to model CS. Furthermore, we adapt the background N-gram and RNN language models to the Code-Switching attitudes of the different speaker clusters, which leads to significant reductions in perplexity. Finally, using these adapted language models, we rerun the speech recognition system for each speaker and achieve improvements in mixed error rate.
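The abstract describes an RNN language model whose input combines a word embedding with POS information and whose output layer is factorized by language, so that p(w) = p(lang) · p(w | lang). The following is a minimal forward-pass sketch of that idea; the toy vocabulary, tag-set size, layer dimensions, and random weights are illustrative assumptions, not the paper's actual model or training setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bilingual vocabulary, split by language for the factorized output layer.
vocab = {"man": ["wo", "shi", "xuesheng"], "en": ["i", "am", "student"]}
n_pos = 4            # toy POS tag-set size (assumption)
d_emb, d_hid = 8, 16 # toy embedding / hidden sizes (assumption)

# Random, untrained parameters -- this sketch only shows the architecture.
E = rng.normal(size=(6, d_emb))                  # word embeddings (6 words total)
W_in = rng.normal(size=(d_hid, d_emb + n_pos))   # input: [embedding ; POS one-hot]
W_rec = rng.normal(size=(d_hid, d_hid))          # recurrent weights
W_lang = rng.normal(size=(2, d_hid))             # language-factor layer (man vs. en)
W_out = {"man": rng.normal(size=(3, d_hid)),     # per-language output blocks
         "en": rng.normal(size=(3, d_hid))}

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def step(word_id, pos_id, h):
    """One RNN step: the input concatenates the word embedding
    with a one-hot encoding of the word's POS tag."""
    pos = np.zeros(n_pos)
    pos[pos_id] = 1.0
    x = np.concatenate([E[word_id], pos])
    h = np.tanh(W_in @ x + W_rec @ h)
    p_lang = softmax(W_lang @ h)
    # Factorized next-word distribution: p(w) = p(lang) * p(w | lang)
    p = {lang: p_lang[i] * softmax(W_out[lang] @ h)
         for i, lang in enumerate(["man", "en"])}
    return h, p

h = np.zeros(d_hid)
h, p = step(word_id=0, pos_id=1, h=h)
```

Because each per-language block is a softmax scaled by that language's probability, the factorized probabilities over the full bilingual vocabulary still sum to one, while the language factor directly exposes the model's code-switching decision.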