String transformation learning

  • Authors:
  • Giorgio Satta;John C. Henderson

  • Affiliations:
  • Università di Padova, Padova, Italy;Johns Hopkins University, Baltimore, MD

  • Venue:
  • ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
  • Year:
  • 1997

Quantified Score

Hi-index 0.01

Visualization

Abstract

String transformation systems have been introduced in (Brill, 1995) and have several applications in natural language processing. In this work we consider the computational problem of automatically learning from a given corpus the set of transformations presenting the best evidence. We introduce an original data structure and efficient algorithms that learn some families of transformations that are relevant for part-of-speech tagging and phonological rule systems. We also show that the same learning problem becomes NP-hard in cases of an unbounded use of don't care symbols in a transformation.