Models of translational equivalence among words

Authors: I. Dan Melamed
Affiliations: West Group
Venue: Computational Linguistics
Year: 2000

Abstract

Parallel texts (bitexts) have properties that distinguish them from other kinds of parallel data. First, most words translate to only one other word. Second, bitext correspondence is typically only partial: many words in each text have no clear equivalent in the other text. This article presents methods for biasing statistical translation models to reflect these properties. Evaluation with respect to independent human judgments has confirmed that translation models biased in this fashion are significantly more accurate than a baseline knowledge-free model. This article also shows how a statistical translation model can take advantage of preexisting knowledge that might be available about particular language pairs. Even the simplest kinds of language-specific knowledge, such as the distinction between content words and function words, are shown to reliably boost translation model performance on some tasks. Statistical models that reflect knowledge about the model domain combine the best of both the rationalist and empiricist paradigms.
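
The two bitext properties named in the abstract (mostly one-to-one translation, and only partial correspondence) can be illustrated with a small sketch. The abstract does not spell out the article's actual models, so the code below is not the author's method: it is a generic greedy linker written for illustration only. The function names, the toy bitext, the raw co-occurrence score, and the `min_score` threshold are all assumptions.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(bitext):
    """Count how often each (source word, target word) pair shares a sentence pair."""
    counts = Counter()
    for src, tgt in bitext:
        for pair in product(set(src), set(tgt)):
            counts[pair] += 1
    return counts


def greedy_one_to_one(src, tgt, score, min_score=2):
    """Link words best-score-first, allowing each word at most one partner.

    Pairs scoring below min_score are never linked, so some words stay
    unlinked (the partial-correspondence property).
    """
    candidates = sorted(
        ((score[(s, t)], s, t) for s in set(src) for t in set(tgt)),
        key=lambda c: (-c[0], c[1], c[2]),  # highest score first, ties alphabetical
    )
    links, used_src, used_tgt = [], set(), set()
    for sc, s, t in candidates:
        if sc < min_score:
            break  # everything after this point is too weak to link
        if s in used_src or t in used_tgt:
            continue  # one-to-one bias: this word already has a partner
        links.append((s, t))
        used_src.add(s)
        used_tgt.add(t)
    return links


# Hypothetical toy bitext: (English tokens, French tokens) per sentence pair.
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["a", "house"], ["une", "maison"]),
]
counts = cooccurrence_counts(bitext)

for src, tgt in bitext:
    print(src, tgt, greedy_one_to_one(src, tgt, counts))
```

On this toy data the first sentence pair is fully linked, while "car" and "voiture" in the second pair stay unlinked because their co-occurrence count falls below the threshold. A raw count is a deliberately crude score; any association measure could be substituted without changing the linking logic.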