We present a novel morphological analysis technique that induces morphological and syntactic symmetry between two languages with highly asymmetrical morphological structures, in order to improve statistical machine translation quality. The technique presupposes fine-grained segmentation of each word in the morphologically rich language into a sequence of prefix(es)-stem-suffix(es), together with part-of-speech tagging of the parallel corpus. The algorithm identifies morphemes to be merged or deleted in the morphologically rich language to induce the desired morphological and syntactic symmetry. The technique significantly improves Arabic-to-English translation quality when applied to IBM Model 1 and Phrase Translation Models trained on corpora ranging in size from 3,500 to 3.3 million sentence pairs.
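To make the merge-or-delete step concrete, the following is a minimal sketch of how such decisions might be applied to one already segmented and tagged word. It is not the paper's algorithm: how the decisions are learned from the parallel corpus is not described here, so the sketch simply takes them as given. The function name `symmetrize`, the tag names (`CONJ`, `DET`, `NOUN`), the role labels, and the transliterated example word are all illustrative assumptions.

```python
from typing import List, Set, Tuple

# One morpheme: (surface form, POS tag, role), where role is
# "prefix", "stem", or "suffix". All tag names here are hypothetical.
Morpheme = Tuple[str, str, str]


def symmetrize(word: List[Morpheme],
               delete_tags: Set[str],
               merge_tags: Set[str]) -> List[str]:
    """Apply given delete/merge decisions to one segmented word.

    Morphemes whose tag is in delete_tags are dropped. Prefixes whose
    tag is in merge_tags are glued onto the next emitted token; suffixes
    whose tag is in merge_tags are glued onto the preceding token.
    Every other morpheme becomes a token of its own.
    """
    kept = [m for m in word if m[1] not in delete_tags]
    tokens: List[str] = []
    pending_prefix = ""  # merged prefixes waiting for the next token
    for form, tag, role in kept:
        if tag in merge_tags and role == "prefix":
            pending_prefix += form
        elif tag in merge_tags and role == "suffix" and tokens:
            tokens[-1] += form
        else:
            tokens.append(pending_prefix + form)
            pending_prefix = ""
    if pending_prefix:  # merged prefix with nothing left to attach to
        tokens.append(pending_prefix)
    return tokens


# Illustrative segmentation of a word meaning roughly "and the book".
word = [("w", "CONJ", "prefix"), ("Al", "DET", "prefix"), ("ktAb", "NOUN", "stem")]

merged = symmetrize(word, delete_tags=set(), merge_tags={"DET"})
deleted = symmetrize(word, delete_tags={"DET"}, merge_tags=set())
```

With these assumed inputs, merging the determiner yields two tokens (the conjunction and the determiner fused with the stem), while deleting it yields the conjunction and the bare stem. In either case the token sequence moves closer to a one-to-one correspondence with an English rendering, which is the kind of symmetry the abstract describes.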