Decision lists for lexical ambiguity resolution: application to accent restoration in Spanish and French

  • Authors: David Yarowsky
  • Affiliations: University of Pennsylvania, Philadelphia, PA
  • Venue: ACL '94: Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics
  • Year: 1994

Abstract

This paper presents a statistical decision procedure for lexical ambiguity resolution. The algorithm exploits both local syntactic patterns and more distant collocational evidence, generating an efficient, effective, and highly perspicuous recipe for resolving a given ambiguity. By identifying and utilizing only the single best disambiguating evidence in a target context, the algorithm avoids the problematic complex modeling of statistical dependencies. Although directly applicable to a wide class of ambiguities, the algorithm is described and evaluated in a realistic case study, the problem of restoring missing accents in Spanish and French text. Current accuracy exceeds 99% on the full task, and typically is over 90% for even the most difficult ambiguities.
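
The idea of relying only on the single best piece of evidence can be illustrated with a minimal sketch. It assumes a two-way ambiguity, ranks each contextual feature by the absolute log-likelihood ratio of the two readings, and classifies with the first matching rule. The function names, the feature strings (`w-1:el`, `near:plazo`), the smoothing constant `alpha`, and the toy training data are all hypothetical, chosen for illustration; they are not taken from the paper.

```python
import math
from collections import defaultdict


def train_decision_list(examples, alpha=0.1):
    """Build a ranked rule list from (features, label) pairs with two labels.

    alpha is an assumed additive smoothing constant that avoids log(0).
    """
    labels = sorted({label for _, label in examples})
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two labels")
    a, b = labels

    # Count how often each feature co-occurs with each label.
    counts = defaultdict(lambda: {a: 0, b: 0})
    for features, label in examples:
        for feature in set(features):
            counts[feature][label] += 1

    # Score each feature by the smoothed log-likelihood ratio of the labels.
    rules = []
    for feature, c in counts.items():
        llr = math.log((c[a] + alpha) / (c[b] + alpha))
        if llr != 0.0:  # a feature with no preference carries no evidence
            rules.append((abs(llr), feature, a if llr > 0 else b))

    # Strongest evidence first.
    rules.sort(key=lambda rule: rule[0], reverse=True)

    # Fall back to the majority label when no rule matches.
    default = max(labels, key=lambda l: sum(1 for _, y in examples if y == l))
    return rules, default


def classify(rules, default, features):
    """Return the label of the highest-ranked rule whose feature is present."""
    present = set(features)
    for _, feature, label in rules:
        if feature in present:
            return label
    return default


# Hypothetical toy data for the Spanish pair "término" / "terminó".
examples = [
    (["w-1:el", "near:plazo"], "término"),
    (["w-1:el", "near:medio"], "término"),
    (["w-1:el"], "término"),
    (["w-1:se", "near:ayer"], "terminó"),
    (["w-1:se", "near:plazo"], "terminó"),
]
rules, default = train_decision_list(examples)
print(classify(rules, default, ["w-1:se", "near:medio"]))
```

In this toy run the context contains conflicting evidence (`w-1:se` favours one reading, `near:medio` the other). Because only the top-ranked matching rule is consulted, no attempt is made to combine the two, which is the sense in which such a procedure sidesteps modeling dependencies between pieces of evidence.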