Optimal Partitioning for Classification and Regression Trees
IEEE Transactions on Pattern Analysis and Machine Intelligence
A systematic comparison of various statistical alignment models
Computational Linguistics
The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II
An efficient method for determining bilingual word classes
EACL '99 Proceedings of the ninth conference on European chapter of the Association for Computational Linguistics
A probabilistic framework for semi-supervised clustering
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Statistical phrase-based translation
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Statistical Machine Translation with Scarce Resources Using Morpho-syntactic Information
Computational Linguistics
Improving statistical MT through morphological analysis
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
On the impact of morphology in English to Spanish statistical MT
Speech Communication
Predicting success in machine translation
EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
NAACL-Short '09 Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers
Exploring different representational units in English-to-Turkish statistical machine translation
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
The 'noisier channel': translation from morphologically complex languages
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
English-to-Czech factored machine translation
StatMT '07 Proceedings of the Second Workshop on Statistical Machine Translation
Statistical machine translation into a morphologically complex language
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Unsupervised search for the optimal segmentation for statistical machine translation
ACLstudent '10 Proceedings of the ACL 2010 Student Research Workshop
IEEE Transactions on Audio, Speech, and Language Processing
Factored bilingual n-gram language models for statistical machine translation
Machine Translation
Combining morpheme-based machine translation with post-processing morpheme prediction
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Translating from morphologically complex languages: a paraphrase-based approach
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Agreement constraints for statistical machine translation into German
WMT '11 Proceedings of the Sixth Workshop on Statistical Machine Translation
Machine translation without words through substring alignment
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Substring-based machine translation
Machine Translation
Hi-index | 0.00 |
Certain distinctions made in the lexicon of one language may be redundant when translating into another language. We quantify redundancy among source types by the similarity of their distributions over target types. We propose a language-independent framework for minimising lexical redundancy that can be optimised directly from parallel text. Optimisation of the source lexicon for a given target language is viewed as model selection over a set of cluster-based translation models.Redundant distinctions between types may exhibit monolingual regularities, for example, inflexion patterns. We define a prior over model structure using a Markov random field and learn features over sets of monolingual types that are predictive of bilingual redundancy. The prior makes model selection more robust without the need for language-specific assumptions regarding redundancy. Using these models in a phrase-based SMT system, we show significant improvements in translation quality for certain language pairs.