Corpus-based methods in natural language generation: friend or foe?

  • Authors:
  • Owen Rambow

  • Affiliations:
  • AT&T Labs -- Research, Florham Park, NJ

  • Venue:
  • EWNLG '01 Proceedings of the 8th European workshop on Natural Language Generation - Volume 8
  • Year:
  • 2001


Abstract

In computational linguistics, the 1990s were characterized by the rapid rise to prominence of corpus-based methods in natural language understanding (NLU). These methods include statistical and machine-learning approaches. In natural language generation (NLG), meanwhile, there was little work using statistical and machine-learning approaches. Some researchers felt that the kinds of ambiguity that appeared to profit from corpus-based approaches in NLU did not exist in NLG: if the input is adequately specified, then all the rules that map it to a correct output can also be explicitly specified. This paper argues that this view is not correct, and that NLG can and does profit from corpus-based methods. The resistance to corpus-based approaches in NLG may have more to do with the fact that in many NLG applications (such as report or description generation) the output to be generated is extremely limited. As is the case in NLU, if the language is limited, hand-crafted methods are adequate and successful. It is thus no surprise that the first use of corpus-based techniques, at ISI (Knight and Hatzivassiloglou, 1995; Langkilde and Knight, 1998), was motivated by the use of NLG not in "traditional" NLG applications but in machine translation, where the range of output language is (potentially) much larger.