Highly-inflected language generation using factored language models

  • Authors:
  • Eder Miranda de Novais; Ivandré Paraboni; Diogo Takaki Ferreira

  • Affiliations:
  • School of Arts, Sciences and Humanities, University of São Paulo, São Paulo, Brazil (all authors)

  • Venue:
  • CICLing'11: Proceedings of the 12th International Conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
  • Year:
  • 2011

Abstract

Statistical language models based on n-gram counts have been shown to successfully replace grammar rules in standard 2-stage (or 'generate-and-select') Natural Language Generation (NLG). In highly-inflected languages, however, the amount of training data required to cope with n-gram sparseness may be simply unobtainable, and the benefits of a statistical approach become less obvious. In this work we address the issue of text generation in a highly-inflected language by making use of factored language models (FLMs) that take morphological information into account. We present a number of experiments involving the use of simple FLMs applied to various surface realisation tasks, showing that FLMs can implement 2-stage generation with results far superior to those of standard n-gram models alone.
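To make the generate-and-select idea concrete, the sketch below shows a minimal, illustrative factored bigram model in Python: each token is a factor bundle (surface form, lemma, morphological tag), and when a surface bigram is unseen the scorer backs off to the less sparse (lemma, tag) factors. This is not the authors' actual model or toolkit; the class and function names (`SimpleFactoredLM`, `select_best`) and the crude backoff weighting are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from typing import List, Tuple

# Hypothetical toy setup: each token is a factor bundle (surface, lemma, morph_tag).
Token = Tuple[str, str, str]

class SimpleFactoredLM:
    """Toy bigram model over factored tokens with (lemma, tag) backoff.

    Counts surface-form bigrams; when a surface bigram is unseen, it backs
    off to a coarser bigram over (lemma, morph_tag) factors, which are far
    less sparse in highly-inflected languages.
    """

    def __init__(self, alpha: float = 0.4):
        self.alpha = alpha                       # crude, fixed backoff weight
        self.surface_bigrams = defaultdict(int)
        self.surface_unigrams = defaultdict(int)
        self.factor_bigrams = defaultdict(int)
        self.factor_unigrams = defaultdict(int)

    def train(self, sentences: List[List[Token]]) -> None:
        for sent in sentences:
            padded = [("<s>", "<s>", "<s>")] + sent
            for prev, cur in zip(padded, padded[1:]):
                self.surface_bigrams[(prev[0], cur[0])] += 1
                self.surface_unigrams[prev[0]] += 1
                pf, cf = (prev[1], prev[2]), (cur[1], cur[2])
                self.factor_bigrams[(pf, cf)] += 1
                self.factor_unigrams[pf] += 1

    def bigram_prob(self, prev: Token, cur: Token) -> float:
        # Prefer the surface bigram estimate when it has been observed.
        c = self.surface_bigrams[(prev[0], cur[0])]
        if c > 0:
            return c / self.surface_unigrams[prev[0]]
        # Otherwise back off to the (lemma, morph_tag) factor bigram.
        pf, cf = (prev[1], prev[2]), (cur[1], cur[2])
        denom = self.factor_unigrams[pf]
        if denom > 0 and self.factor_bigrams[(pf, cf)] > 0:
            return self.alpha * self.factor_bigrams[(pf, cf)] / denom
        return 1e-6  # small floor for entirely unseen events

    def score(self, sent: List[Token]) -> float:
        padded = [("<s>", "<s>", "<s>")] + sent
        prob = 1.0
        for prev, cur in zip(padded, padded[1:]):
            prob *= self.bigram_prob(prev, cur)
        return prob

def select_best(candidates: List[List[Token]], lm: SimpleFactoredLM) -> List[Token]:
    """'Select' stage of generate-and-select: keep the highest-scoring candidate."""
    return max(candidates, key=lm.score)
```

In a 2-stage realiser along these lines, the 'generate' stage would over-generate inflected surface candidates for a given input, and `select_best` would pick the one the factored model scores highest; the backoff to lemma and morphological-tag factors is what lets the model rank candidates sensibly even when the exact inflected word sequence never occurred in training.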