String transformation learning

Authors:
Giorgio Satta;John C. Henderson
Affiliations:
Università di Padova, Padova, Italy;Johns Hopkins University, Baltimore, MD
Venue:
ACL '97 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Year:
1997

References:
Text algorithms
Regular models of phonological rule systems. Computational Linguistics - Special issue on computational phonology
Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Computational Linguistics
A Space-Economical Suffix Tree Construction Algorithm. Journal of the ACM (JACM)
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text
Computers and Intractability: A Guide to the Theory of NP-Completeness

Cited by:
Bootstrapping morphological analyzers by combining human elicitation and machine learning. Computational Linguistics
A memory-based approach to learning shallow natural language patterns. COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Multipath translation lexicon induction via bridge languages. NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies

Abstract

String transformation systems were introduced by Brill (1995) and have several applications in natural language processing. In this work we consider the computational problem of automatically learning, from a given corpus, the set of transformations supported by the best evidence. We introduce an original data structure and efficient algorithms that learn some families of transformations relevant to part-of-speech tagging and phonological rule systems. We also show that the same learning problem becomes NP-hard when transformations may make unbounded use of don't-care symbols.
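
To make the learning problem concrete, here is a minimal brute-force sketch in Python. It is not the data structure or the algorithms introduced in the paper; it only illustrates what "the transformation with the best evidence" can mean under some assumptions made for this example: the corpus is a pair of equal-length strings (`current` and `truth`), a transformation rewrites one symbol inside a matched context pattern, and evidence is the number of corrected positions minus the number of newly broken ones. All names (`Rule`, `matches`, `score`, `candidates`, `best_rule`) and the `?` don't-care symbol are hypothetical.

```python
from collections import namedtuple

# Hypothetical rule shape: when `pattern` matches the current string,
# rewrite the symbol at offset `focus` inside the match to `new_symbol`.
Rule = namedtuple("Rule", "pattern focus new_symbol")

DONT_CARE = "?"  # assumed wildcard: matches any single symbol


def matches(pattern, text, start):
    """True if `pattern` matches `text` at `start`, honoring don't-cares."""
    if start < 0 or start + len(pattern) > len(text):
        return False
    window = text[start:start + len(pattern)]
    return all(p == DONT_CARE or p == c for p, c in zip(pattern, window))


def score(rule, current, truth):
    """Evidence for a rule: corrected positions minus newly broken ones."""
    good = bad = 0
    for start in range(len(current) - len(rule.pattern) + 1):
        if not matches(rule.pattern, current, start):
            continue
        pos = start + rule.focus
        if current[pos] == rule.new_symbol:
            continue  # the rule changes nothing here
        if truth[pos] == rule.new_symbol:
            good += 1  # an error gets fixed
        elif truth[pos] == current[pos]:
            bad += 1   # a correct symbol gets damaged
    return good - bad


def candidates(current, truth, max_len):
    """Propose rules from every context window around each error site."""
    rules = set()
    for pos, (c, t) in enumerate(zip(current, truth)):
        if c == t:
            continue
        for length in range(1, max_len + 1):
            for focus in range(length):
                start = pos - focus
                if start < 0 or start + length > len(current):
                    continue
                rules.add(Rule(current[start:start + length], focus, t))
    return rules


def best_rule(current, truth, max_len=3):
    """Return (rule, score) with the highest score, or (None, 0) if no errors.

    Ties are broken toward shorter patterns, then by tuple order, so the
    result does not depend on set iteration order.
    """
    rules = candidates(current, truth, max_len)
    if not rules:
        return None, 0
    best = max(rules, key=lambda r: (score(r, current, truth), -len(r.pattern), r))
    return best, score(best, current, truth)


if __name__ == "__main__":
    current = "cacbcacb"  # e.g. the output of an initial tagger
    truth = "cadbcadb"    # the annotated corpus
    print(best_rule(current, truth))
    print(score(Rule("a?b", 1, "d"), current, truth))
```

In this toy corpus, rewriting every `c` to `d` fixes two positions and breaks two others, so a context-free rule gains nothing and a context of length two is needed. The sketch only generates candidates without don't-care symbols; enumerating every placement of `?` in a window would multiply the candidate set exponentially in the window length, which gives some intuition for why unrestricted don't-cares are the hard case, although it is not the paper's NP-hardness argument.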