Fertility models for statistical natural language understanding

Authors:
Stephen Della Pietra;Mark Epstein;Salim Roukos;Todd Ward
Affiliations:
IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Thomas J. Watson Research Center, Yorktown Heights, NY;IBM Thomas J. Watson Research Center, Yorktown Heights, NY
Venue:
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Year:
1997

References (6):
The ATIS spoken language systems pilot corpus. HLT '90 Proceedings of the workshop on Speech and Natural Language
The Application of Semantic Classification Trees to Natural Language Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence
Statistical source channel models for natural language understanding
The mathematics of statistical machine translation: parameter estimation. Computational Linguistics - Special issue on using large corpora: II
A fully statistical approach to natural language interfaces. ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Statistical natural language understanding using hidden clumpings. ICASSP '96 Proceedings of the 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing - Volume 01

Cited by (2):
A multiple-application conversational agent. Proceedings of the 9th international conference on Intelligent user interfaces
Spoken language understanding using weakly supervised learning. Computer Speech and Language

Abstract

Several recent efforts in statistical natural language understanding (NLU) have focused on generating clumps of English words from semantic meaning concepts (Miller et al., 1995; Levin and Pieraccini, 1995; Epstein et al., 1996; Epstein, 1996). This paper extends the IBM Machine Translation Group's concept of fertility (Brown et al., 1993) to the generation of clumps for natural language understanding. The basic underlying intuition is that a single concept may be expressed in English as many disjoint clumps of words. We present two fertility models which attempt to capture this phenomenon. The first is a Poisson model, which leads to appealing computational simplicity. The second is a general nonparametric fertility model. The general model's parameters are bootstrapped from the Poisson model and updated by the EM algorithm. These fertility models can be used to impose clump fertility structure on top of preexisting clump generation models. Here, we present results for adding fertility structure to unigram, bigram, and headword clump generation models on ARPA's Air Travel Information Service (ATIS) domain.
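
As a rough illustration of the two fertility models named in the abstract, the sketch below computes a Poisson probability for the number of clumps a concept generates, and then initializes a nonparametric fertility table from it. The function names, the truncation at a maximum fertility, the renormalization, and the count-normalizing update are all assumptions made for this example; the abstract does not specify how the paper parameterizes or re-estimates either model.

```python
import math
from typing import List


def poisson_fertility(n: int, lam: float) -> float:
    """P(concept generates n clumps) under a Poisson with mean lam (assumed form)."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def bootstrap_table(lam: float, max_fertility: int) -> List[float]:
    """Initialize a nonparametric fertility table from the Poisson model.

    Truncating at max_fertility and renormalizing is an assumption of this
    sketch, not a detail taken from the paper.
    """
    raw = [poisson_fertility(n, lam) for n in range(max_fertility + 1)]
    total = sum(raw)
    return [p / total for p in raw]


def reestimate(expected_counts: List[float]) -> List[float]:
    """EM-style M-step: turn expected fertility counts into probabilities."""
    total = sum(expected_counts)
    return [c / total for c in expected_counts]


if __name__ == "__main__":
    # Hypothetical concept with an average of 1.5 clumps per occurrence.
    table = bootstrap_table(lam=1.5, max_fertility=4)
    for n, p in enumerate(table):
        print(n, round(p, 4))
```

The table produced by `bootstrap_table` has one free parameter per fertility value, so later updates such as `reestimate` can move it away from the Poisson shape when the expected counts call for it.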