Highly-inflected language generation using factored language models

  • Authors:
  • Eder Miranda de Novais; Ivandré Paraboni; Diogo Takaki Ferreira

  • Affiliations:
  • School of Arts, Sciences and Humanities, University of São Paulo, São Paulo, Brazil (all authors)

  • Venue:
  • CICLing'11: Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2011

Abstract

Statistical language models based on n-gram counts have been shown to successfully replace grammar rules in standard 2-stage (or 'generate-and-select') Natural Language Generation (NLG). In highly-inflected languages, however, the amount of training data required to cope with n-gram sparseness may be simply unobtainable, and the benefits of a statistical approach become less obvious. In this work we address the issue of text generation in a highly-inflected language by making use of factored language models (FLMs) that take morphological information into account. We present a number of experiments involving the use of simple FLMs applied to various surface realisation tasks, showing that FLMs can implement 2-stage generation with results far superior to those of standard n-gram models alone.
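To make the generate-and-select idea concrete, the sketch below shows a minimal, illustrative factored bigram model in Python: each token is a factor bundle (surface form, lemma, morphological tag), and when a surface bigram is unseen the scorer backs off to the less sparse (lemma, tag) factors. This is not the authors' actual model or toolkit; the class and function names (`SimpleFactoredLM`, `select_best`) and the crude backoff weighting are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from typing import List, Tuple

# Hypothetical toy setup: each token is a factor bundle (surface, lemma, morph_tag).
Token = Tuple[str, str, str]

class SimpleFactoredLM:
    """Toy bigram model over factored tokens with (lemma, tag) backoff.

    Counts surface-form bigrams; when a surface bigram is unseen, it backs
    off to a coarser bigram over (lemma, morph_tag) factors, which are far
    less sparse in highly-inflected languages.
    """

    def __init__(self, alpha: float = 0.4):
        self.alpha = alpha                       # crude, fixed backoff weight
        self.surface_bigrams = defaultdict(int)
        self.surface_unigrams = defaultdict(int)
        self.factor_bigrams = defaultdict(int)
        self.factor_unigrams = defaultdict(int)

    def train(self, sentences: List[List[Token]]) -> None:
        for sent in sentences:
            padded = [("<s>", "<s>", "<s>")] + sent
            for prev, cur in zip(padded, padded[1:]):
                self.surface_bigrams[(prev[0], cur[0])] += 1
                self.surface_unigrams[prev[0]] += 1
                pf, cf = (prev[1], prev[2]), (cur[1], cur[2])
                self.factor_bigrams[(pf, cf)] += 1
                self.factor_unigrams[pf] += 1

    def bigram_prob(self, prev: Token, cur: Token) -> float:
        # Prefer the surface bigram estimate when it has been observed.
        c = self.surface_bigrams[(prev[0], cur[0])]
        if c > 0:
            return c / self.surface_unigrams[prev[0]]
        # Otherwise back off to the (lemma, morph_tag) factor bigram.
        pf, cf = (prev[1], prev[2]), (cur[1], cur[2])
        denom = self.factor_unigrams[pf]
        if denom > 0 and self.factor_bigrams[(pf, cf)] > 0:
            return self.alpha * self.factor_bigrams[(pf, cf)] / denom
        return 1e-6  # small floor for entirely unseen events

    def score(self, sent: List[Token]) -> float:
        padded = [("<s>", "<s>", "<s>")] + sent
        prob = 1.0
        for prev, cur in zip(padded, padded[1:]):
            prob *= self.bigram_prob(prev, cur)
        return prob

def select_best(candidates: List[List[Token]], lm: SimpleFactoredLM) -> List[Token]:
    """'Select' stage of generate-and-select: keep the highest-scoring candidate."""
    return max(candidates, key=lm.score)
```

In a 2-stage realiser along these lines, the 'generate' stage would over-generate inflected surface candidates for a given input, and `select_best` would pick the one the factored model scores highest; the backoff to lemma and morphological-tag factors is what lets the model rank candidates sensibly even when the exact inflected word sequence never occurred in training.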