Building Language Models for Continuous Speech Recognition Systems

  • Authors: Nuno Souto, Hugo Meinedo, João P. Neto

  • Venue: PorTAL '02 Proceedings of the Third International Conference on Advances in Natural Language Processing
  • Year: 2002

Abstract

This paper describes the creation of language models for a continuous speech recognition system for the Portuguese language. First, we discuss the process used to create and update a text corpus built from newspaper editions collected from the Web, from which we generated N-gram language models. We then present the procedure used to improve those models for a Broadcast News (BN) recognition task by interpolating them with a language model based on BN transcriptions. Finally, the paper details a method for generating morpheme-based language models.
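
To make the interpolation step concrete, here is a minimal sketch of how two N-gram models might be combined. The abstract does not say which interpolation scheme, smoothing method, or weights were used, so everything below is an assumption for illustration: linear interpolation, add-one smoothed bigram models, toy sentences standing in for the newspaper and BN corpora, and the hypothetical names `bigram_model`, `interpolate`, and `lam`.

```python
from collections import Counter


def bigram_model(sentences, vocab):
    """Build an add-one smoothed bigram model p(word | prev) from raw sentences.

    Add-one smoothing is an assumption made for this sketch, not the
    smoothing used in the paper.
    """
    bigrams = Counter()
    contexts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    size = len(vocab)

    def prob(word, prev):
        return (bigrams[(prev, word)] + 1) / (contexts[prev] + size)

    return prob


def interpolate(p_a, p_b, lam):
    """Linearly interpolate two models: lam * p_a + (1 - lam) * p_b."""

    def prob(word, prev):
        return lam * p_a(word, prev) + (1 - lam) * p_b(word, prev)

    return prob


# Toy stand-ins for the two text sources (hypothetical data).
newspaper_text = ["o governo anunciou medidas", "o mercado fechou em alta"]
bn_transcripts = ["boa noite", "o governo reagiu"]

# Shared vocabulary of predictable tokens, including the end-of-sentence marker.
vocab = {w for s in newspaper_text + bn_transcripts for w in s.split()} | {"</s>"}

p_news = bigram_model(newspaper_text, vocab)
p_bn = bigram_model(bn_transcripts, vocab)

# lam is the weight given to the newspaper model; 0.3 is an arbitrary choice.
lam = 0.3
p_mixed = interpolate(p_news, p_bn, lam)

print(p_mixed("governo", "o"))
```

Because both component models share one vocabulary and each is normalized over it, the mixture is also a proper distribution for any weight between 0 and 1. In practice the weight would typically be tuned on held-out in-domain text rather than fixed by hand.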