On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
IEEE Transactions on Pattern Analysis and Machine Intelligence
Programming pearls: little languages
Communications of the ACM
Inference of Finite-State Transducers by Using Regular Grammars and Morphisms
ICGI '00 Proceedings of the 5th International Colloquium on Grammatical Inference: Algorithms and Applications
Computational Complexity of Problems on Probabilistic Grammars and Transducers
ICGI '00 Proceedings of the 5th International Colloquium on Grammatical Inference: Algorithms and Applications
Modeling and learning multilingual inflectional morphology in a minimally supervised framework
Finite-state transducers in language and speech processing
Computational Linguistics
Efficient generation in primitive Optimality Theory
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
Parameter estimation for probabilistic finite-state transducers
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Probabilistic CFG with latent annotations
ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Moses: open source toolkit for statistical machine translation
ACL '07 Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions
OpenFst: a general and efficient weighted finite-state transducer library
CIAA'07 Proceedings of the 12th international conference on Implementation and application of automata
Improved reconstruction of protolanguage word forms
NAACL '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Loss-sensitive discriminative training of machine transliteration models
SRWS '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Student Research Workshop and Doctoral Consortium
A global model for joint lemmatization and part-of-speech prediction
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Graphical models over multiple strings
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 1
Discriminative substring decoding for transliteration
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3
Maximum likelihood estimation of feature-based distributions
SIGMORPHON '10 Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology
Predicting the semantic compositionality of prefix verbs
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Entity disambiguation for knowledge base population
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Automated whole sentence grammar correction using a noisy channel model
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Research on Language and Computation
Bilingual random walk models for automated grammar correction of ESL author-produced text
IUNLPBEA '11 Proceedings of the 6th Workshop on Innovative Use of NLP for Building Educational Applications
Discovering morphological paradigms from plain text using a Dirichlet process mixture model
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Implicitly intersecting weighted automata using dual decomposition
NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Bootstrapping a unified model of lexical and phonetic acquisition
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Name phylogeny: a generative model of string variation
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Information Sciences: an International Journal
String-to-string transduction is a central problem in computational linguistics and natural language processing. It arises in tasks as diverse as name transliteration, spelling correction, pronunciation modeling, and inflectional morphology. We present a conditional log-linear model for string-to-string transduction that employs overlapping features over latent alignment sequences and learns latent classes and latent string-pair regions from incomplete training data. We evaluate our approach on morphological tasks and demonstrate that latent variables can dramatically improve results, even when the model is trained on small data sets. On the task of generating morphological forms, we outperform a baseline method, reducing the error rate by up to 48%. On a lemmatization task, we reduce the error rates reported in Wicentowski (2002) by 38–92%.