This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite-state transducers that accept underlying forms as input and generate surface forms as output. The learning algorithm extends OSTIA, an algorithm for learning general subsequential finite-state transducers. Although OSTIA can learn arbitrary subsequential finite-state transducers in the limit, large dictionaries of actual English pronunciations did not provide enough samples for it to induce phonological rules correctly. We therefore augmented OSTIA with two kinds of knowledge specific to natural language phonology, in the form of biases from "universal grammar". The first bias is that underlying phones are often realized as phonetically similar or identical surface phones. The second favors phonological rules that apply across natural phonological classes. With these additions, the algorithm learns more compact, accurate, and general transducers than unmodified OSTIA. An implementation of the algorithm successfully learns a number of English postlexical rules.
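To make the representation concrete, the following is a minimal sketch of a subsequential finite-state transducer that maps an underlying form to a surface form. It is a hand-built toy, not OSTIA and not a transducer learned by the method described here. The rule it encodes (a /t/ between two vowels surfaces as a flap, written `dx`), the tiny phone inventory, and all names such as `SubsequentialTransducer` and `build_toy_flapping` are assumptions chosen for illustration only.

```python
from typing import Dict, List, Tuple

Transition = Tuple[int, List[str]]  # (next state, phones emitted)


class SubsequentialTransducer:
    """Deterministic on input; each state also has a final output string."""

    def __init__(
        self,
        start: int,
        transitions: Dict[Tuple[int, str], Transition],
        final_output: Dict[int, List[str]],
    ) -> None:
        self.start = start
        self.transitions = transitions
        self.final_output = final_output

    def transduce(self, underlying: List[str]) -> List[str]:
        state = self.start
        surface: List[str] = []
        for phone in underlying:
            state, emitted = self.transitions[(state, phone)]
            surface.extend(emitted)
        # Flush whatever output the last state is still withholding.
        surface.extend(self.final_output[state])
        return surface


# Hypothetical toy inventory, assumed for this example only.
VOWELS = ["a", "i"]
OTHER_CONSONANTS = ["n", "s"]


def build_toy_flapping() -> SubsequentialTransducer:
    """Toy rule: t -> dx between two vowels.

    State 0: start, or last phone was a consonant.
    State 1: last phone was a vowel.
    State 2: saw vowel + t; the t is withheld until the next phone is known.
    """
    trans: Dict[Tuple[int, str], Transition] = {}
    for v in VOWELS:
        trans[(0, v)] = (1, [v])
        trans[(1, v)] = (1, [v])
        trans[(2, v)] = (1, ["dx", v])  # withheld t surfaces as a flap
    for c in OTHER_CONSONANTS:
        trans[(0, c)] = (0, [c])
        trans[(1, c)] = (0, [c])
        trans[(2, c)] = (0, ["t", c])  # withheld t surfaces unchanged
    trans[(0, "t")] = (0, ["t"])
    trans[(1, "t")] = (2, [])  # delay the decision about this t
    trans[(2, "t")] = (0, ["t", "t"])
    final_output = {0: [], 1: [], 2: ["t"]}  # word-final t stays a t
    return SubsequentialTransducer(0, trans, final_output)


toy = build_toy_flapping()
print(toy.transduce(["a", "t", "a"]))
print(toy.transduce(["a", "t", "n", "a"]))
```

The withheld output in state 2 is what distinguishes a subsequential transducer from a simple symbol-by-symbol mapping: the machine stays deterministic on its input while postponing output until the right-hand context is known, and the per-state final output handles a word that ends while a phone is still pending.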