Trainable methods for surface natural language generation

  • Authors:
  • Adwait Ratnaparkhi

  • Affiliations:
  • IBM TJ Watson Research Center, Yorktown Heights, NY

  • Venue:
  • NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present three systems for surface natural language generation that are trainable from annotated corpora. The first two systems, called NLG1 and NLG2, require a corpus marked only with domainspecific semantic attributes, while the last system, called NLG3, requires a corpus marked with both semantic attributes and syntactic dependency information. All systems attempt to produce a grammatical natural language phrase from a domain-specific semantic representation. NLG1 serves a baseline system and uses phrase frequencies to generate a whole phrase in one step, while NLG2 and NLG3 use maximum entropy probability models to individually generate each word in the phrase. The systems NLG2 and NLG3 learn to determine both the word choice and the word order of the phrase. We present experiments in which we generate phrases to describe flights in the air travel domain.