Coping with ambiguity and unknown words through probabilistic models

  • Authors:
  • Ralph Weischedel;Richard Schwartz;Jeff Palmucci;Marie Meteer;Lance Ramshaw

  • Affiliations:
  • BBN Systems and Technologies;BBN Systems and Technologies;BBN Systems and Technologies;Rensselaer Polytechnic Institute;Bowdoin College

  • Venue:
  • Computational Linguistics - Special issue on using large corpora: II
  • Year:
  • 1993

Abstract

From spring 1990 through fall 1991, we performed a battery of small experiments to test the effectiveness of supplementing knowledge-based techniques with probabilistic models. This paper reports our experiments in predicting parts of speech of highly ambiguous words, predicting the intended interpretation of an utterance when more than one interpretation satisfies all known syntactic and semantic constraints, and learning caseframe information for verbs from example uses. From these experiments, we are convinced that probabilistic models based on annotated corpora can effectively reduce the ambiguity in processing text and can be used to acquire lexical information from a corpus, by supplementing knowledge-based techniques. Based on the results of those experiments, we have constructed a new natural language system (PLUM) for extracting data from text, e.g., newswire text.