The authors propose a method for automatically generating Japanese–English bilingual thesauri from bilingual corpora. Here, a bilingual thesaurus is a set of bilingual equivalent words together with their synonyms. Most methods proposed so far for extracting bilingual equivalent word clusters from bilingual corpora depend heavily on word frequency and are not effective for low-frequency clusters. These low-frequency bilingual clusters are worth extracting because they contain many newly coined terms that are in demand but are not listed in existing bilingual thesauri. Assuming that language-pair-independent methods used on their own, such as frequency-based ones, have reached their limits, and that a language-pair-dependent method used in combination with other methods shows promise, the authors propose the following approach: (a) extract translation pairs based on transliteration patterns; (b) remove those pairs from the candidate words; (c) extract translation pairs based on word frequency from the remaining candidate words; and (d) generate bilingual clusters from the extracted pairs using a graph-theoretic method. The proposed method has been found to be significantly more effective than other methods. © 2006 Wiley Periodicals, Inc.
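Steps (a) to (d) can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption: Japanese terms are taken to be already romanized, string similarity from `difflib.SequenceMatcher` stands in for the transliteration patterns, a Dice coefficient over aligned segments stands in for the frequency-based step, connected components stand in for the unspecified graph-theoretic method, and the function names and thresholds are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import product


def transliteration_pairs(src_terms, tgt_terms, threshold=0.8):
    """Step (a), assumed form: pair terms whose spellings are similar."""
    return {
        (s, t)
        for s, t in product(src_terms, tgt_terms)
        if SequenceMatcher(None, s.lower(), t.lower()).ratio() >= threshold
    }


def frequency_pairs(aligned, src_terms, tgt_terms, threshold=0.5):
    """Step (c), assumed form: Dice coefficient over aligned segments."""
    src_count, tgt_count, both = defaultdict(int), defaultdict(int), defaultdict(int)
    for src_seg, tgt_seg in aligned:
        s_here = src_terms & set(src_seg)
        t_here = tgt_terms & set(tgt_seg)
        for s in s_here:
            src_count[s] += 1
        for t in t_here:
            tgt_count[t] += 1
        for pair in product(s_here, t_here):
            both[pair] += 1
    return {
        (s, t)
        for (s, t), n in both.items()
        if 2 * n / (src_count[s] + tgt_count[t]) >= threshold
    }


def clusters(pairs):
    """Step (d), assumed form: connected components of the pair graph."""
    graph = defaultdict(set)
    for s, t in pairs:
        graph[("src", s)].add(("tgt", t))
        graph[("tgt", t)].add(("src", s))
    seen, result = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:  # depth-first walk over one component
            cur = stack.pop()
            if cur not in comp:
                comp.add(cur)
                stack.extend(graph[cur] - comp)
        seen |= comp
        result.append((
            frozenset(w for side, w in comp if side == "src"),
            frozenset(w for side, w in comp if side == "tgt"),
        ))
    return result


def build_thesaurus(aligned, src_terms, tgt_terms):
    src_terms, tgt_terms = set(src_terms), set(tgt_terms)
    translit = transliteration_pairs(src_terms, tgt_terms)   # (a)
    rest_src = src_terms - {s for s, _ in translit}          # (b)
    rest_tgt = tgt_terms - {t for _, t in translit}
    freq = frequency_pairs(aligned, rest_src, rest_tgt)      # (c)
    return clusters(translit | freq)                         # (d)
```

Running the transliteration step first means a loanword that occurs only once can still be paired, and removing it in step (b) keeps it from adding noise to the co-occurrence counts in step (c).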