This paper introduces a novel approach to the task of lexical translation between languages for which no translation dictionaries are available. We build a massive translation graph, automatically constructed from over 630 machine-readable dictionaries and Wiktionaries. In this graph, each node denotes a word in some language and each edge (v_i, v_j) denotes a word sense shared by v_i and v_j. Our current graph contains over 10,000,000 nodes and expresses more than 60,000,000 pairwise translations. Composing multiple translation dictionaries leads to a transitive inference problem: if word A translates to word B, which in turn translates to word C, what is the probability that C is a translation of A? The paper describes a series of probabilistic inference algorithms that solve this problem at varying precision and recall levels. All algorithms enable us to quantify our confidence in a translation derived from the graph, and thus trade precision for recall. We compile the results of our best inference algorithm to yield PanDictionary, a novel multilingual dictionary. PanDictionary contains more than four times as many translations as the largest Wiktionary at precision 0.90, and over 200,000,000 pairwise translations in over 200,000 language pairs at precision 0.8.
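To make the transitive inference problem concrete, the following is a minimal sketch, not one of the paper's algorithms. It assumes each dictionary entry carries a hypothetical reliability score, scores a two-hop path A-B-C as the product of its edge scores, and combines several paths to the same candidate with a noisy-or, which treats the paths as independent (a simplification). The function names, the toy entries, and the 0.9 scores are all illustrative assumptions.

```python
from collections import defaultdict


def build_graph(entries):
    """Build an undirected translation graph.

    entries: iterable of (word_a, word_b, score), where each word is a
    (language, spelling) tuple and score is an assumed edge reliability.
    """
    graph = defaultdict(dict)
    for a, b, score in entries:
        # Keep the highest score if the same pair is listed more than once.
        graph[a][b] = max(score, graph[a].get(b, 0.0))
        graph[b][a] = max(score, graph[b].get(a, 0.0))
    return graph


def two_hop_translations(graph, source, target_lang):
    """Score target-language words reachable from source via one pivot word."""
    # miss[c] = probability that none of the paths to c is a valid translation.
    miss = defaultdict(lambda: 1.0)
    for pivot, p_ab in graph.get(source, {}).items():
        for cand, p_bc in graph.get(pivot, {}).items():
            if cand == source or cand[0] != target_lang:
                continue
            miss[cand] *= 1.0 - p_ab * p_bc
    return {cand: 1.0 - m for cand, m in miss.items()}


def filter_by_confidence(scores, threshold):
    """Keep candidates at or above the threshold (higher = fewer, safer)."""
    return {cand: s for cand, s in scores.items() if s >= threshold}


# Toy data: English "spring" is polysemous, so pivots lead to different senses.
entries = [
    (("en", "spring"), ("fr", "ressort"), 0.9),
    (("en", "spring"), ("es", "resorte"), 0.9),
    (("en", "spring"), ("fr", "source"), 0.9),
    (("fr", "ressort"), ("de", "Feder"), 0.9),
    (("es", "resorte"), ("de", "Feder"), 0.9),
    (("fr", "source"), ("de", "Quelle"), 0.9),
]
graph = build_graph(entries)
scores = two_hop_translations(graph, ("en", "spring"), "de")
strict = filter_by_confidence(scores, 0.9)
lenient = filter_by_confidence(scores, 0.8)
```

In this toy graph, "Feder" is supported by two pivot words and so scores higher than "Quelle", which has only one supporting path. Raising the threshold keeps only the better-supported candidate, while lowering it admits both, which is the precision-for-recall trade the abstract refers to.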