Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools

  • Authors:
  • Rafael Oliveira; Pedro Batista; Nelson Neto; Aldebaro Klautau

  • Affiliations:
  • Signal Processing Laboratory, Federal University of Pará, Belém, PA, Brazil (all authors)

  • Venue:
  • PROPOR'12: Proceedings of the 10th International Conference on Computational Processing of the Portuguese Language
  • Year:
  • 2012

Abstract

Advances in speech processing research rely on the availability of public resources such as corpora, statistical models, and baseline systems. In contrast to languages such as English, few such resources exist for Brazilian Portuguese. This work describes efforts to narrow this gap. Baseline acoustic models for Brazilian Portuguese were built using the CMU Sphinx toolkit and public-domain resources: speech corpora, a phonetic dictionary, and a language model. Experiments were carried out on dictation and grammar tasks, and the results can serve as a reference for further research. Some of the trained acoustic models and a reference speech corpus were made publicly available.