Web-scale N-gram models for lexical disambiguation

  • Authors:
  • Shane Bergsma;Dekang Lin;Randy Goebel

  • Affiliations:
  • Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada;Google, Inc., Mountain View, California;Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada

  • Venue:
  • IJCAI'09 Proceedings of the 21st International Joint Conference on Artificial Intelligence
  • Year:
  • 2009

Abstract

Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple and overlapping segments of context. On the tasks of preposition selection and context-sensitive spelling correction, the supervised system reduces disambiguation error by 20-24% over the current state of the art.
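
To make the idea of combining counts from multiple, overlapping segments of context concrete, the following is a minimal Python sketch of an unsupervised scorer. It is an illustration under stated assumptions, not the authors' system: the count table `TOY_COUNTS`, the function names, the add-one smoothing, and the choice to sum log-counts are all hypothetical, and the numbers are invented toy values rather than real web counts.

```python
import math

# Hypothetical toy counts standing in for a web-scale N-gram collection.
# The values are invented for illustration only.
TOY_COUNTS = {
    ("depends", "on"): 900,
    ("depends", "in"): 20,
    ("on", "the"): 5000,
    ("in", "the"): 8000,
    ("it", "depends", "on"): 300,
    ("it", "depends", "in"): 2,
    ("depends", "on", "the"): 400,
    ("depends", "in", "the"): 3,
}


def context_ngrams(tokens, position, candidate, max_order=3):
    """Return every n-gram of order 2..max_order that covers `position`,
    with the candidate word substituted at that position."""
    filled = list(tokens)
    filled[position] = candidate
    spans = []
    for order in range(2, max_order + 1):
        # Slide a window of this order across all offsets that include the target.
        for start in range(position - order + 1, position + 1):
            end = start + order
            if start >= 0 and end <= len(filled):
                spans.append(tuple(filled[start:end]))
    return spans


def score(tokens, position, candidate, counts):
    """Sum add-one-smoothed log-counts over all overlapping context n-grams.
    (Assumed scoring rule; unseen n-grams contribute log(1) = 0.)"""
    return sum(
        math.log(counts.get(ngram, 0) + 1)
        for ngram in context_ngrams(tokens, position, candidate)
    )


def disambiguate(tokens, position, candidates, counts):
    """Pick the candidate whose filled-in contexts have the highest total score."""
    return max(candidates, key=lambda c: score(tokens, position, c, counts))


sentence = ["it", "depends", "_", "the", "weather"]
best = disambiguate(sentence, 2, ["on", "in"], TOY_COUNTS)
print(best)
```

In this toy preposition-selection example, the bigram "in the" is more frequent than "on the", so a scorer that looked only at that one fixed span would choose "in". Adding the other overlapping bigrams and trigrams around the target shifts the decision to "on", which is the kind of effect the abstract attributes to using several segments of context at once.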