Wordform- and class-based prediction of the components of German nominal compounds in an AAC system

  • Authors:
  • Marco Baroni; Johannes Matiasek; Harald Trost

  • Affiliations:
  • Austrian Research Institute for Artificial Intelligence, Vienna, Austria; Austrian Research Institute for Artificial Intelligence, Vienna, Austria; University of Vienna, Vienna, Austria

  • Venue:
  • COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
  • Year:
  • 2002

Abstract

In word prediction systems for augmentative and alternative communication (AAC), productive word-formation processes such as compounding pose a serious problem. We present a model that predicts German nominal compounds by splitting them into their modifier and head components instead of trying to predict them as a whole. The model is further improved by class-based modifier-head bigrams built from semantic classes that are automatically extracted from a corpus. The evaluation shows that the split-compound model with class bigrams improves keystroke savings by more than 15% over a no-split compound baseline model. We also present preliminary results obtained with a word prediction model that integrates compound and simple word prediction.
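
The idea of class-based modifier-head bigrams can be illustrated with a minimal sketch. It assumes the textbook class-bigram factorization, P(head | modifier) ≈ P(class(head) | class(modifier)) · P(head | class(head)), which is not necessarily the exact formulation used in the paper. The word list, the class labels, and the function names `class_bigram_prob` and `rank_heads` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Toy, hypothetical data: (modifier, head) pairs from already-split compounds.
pairs = [
    ("Apfel", "Saft"), ("Apfel", "Kuchen"), ("Orange", "Saft"),
    ("Kirsch", "Kuchen"), ("Auto", "Bahn"), ("Auto", "Fahrer"),
]

# Hypothetical hand-assigned semantic classes; the paper extracts its
# classes automatically from a corpus.
word_class = {
    "Apfel": "FRUIT", "Orange": "FRUIT", "Kirsch": "FRUIT", "Auto": "VEHICLE",
    "Saft": "FOOD", "Kuchen": "FOOD", "Bahn": "TRANSPORT", "Fahrer": "TRANSPORT",
}

# Counts needed for the two factors of the class-bigram estimate.
class_bigrams = Counter((word_class[m], word_class[h]) for m, h in pairs)
mod_class_counts = Counter(word_class[m] for m, _ in pairs)
head_counts = Counter(h for _, h in pairs)
head_class_counts = Counter(word_class[h] for _, h in pairs)


def class_bigram_prob(modifier, head):
    """Estimate P(head | modifier) through the classes of both words."""
    cm, ch = word_class[modifier], word_class[head]
    p_class = class_bigrams[(cm, ch)] / mod_class_counts[cm]
    p_word = head_counts[head] / head_class_counts[ch]
    return p_class * p_word


def rank_heads(modifier, prefix=""):
    """Rank known heads that match the typed prefix, best candidate first."""
    candidates = [h for h in head_counts if h.lower().startswith(prefix.lower())]
    return sorted(candidates, key=lambda h: class_bigram_prob(modifier, h),
                  reverse=True)
```

In this toy data the pair ("Orange", "Kuchen") never occurs, so a wordform bigram would give it no support, while the class-based estimate still assigns it a nonzero probability because FRUIT modifiers are followed by FOOD heads. A head such as "Bahn" gets probability zero after "Orange", since no FRUIT-TRANSPORT pair was observed.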