In this paper we investigate Chinese-English name transliteration using comparable corpora: corpora in which texts in the two languages cover some of the same topics, and therefore share references to named entities, but are not translations of each other. We present two distinct methods for transliteration: one uses phonetic transliteration, and the other uses the temporal distribution of candidate pairs. Each approach works quite well on its own, and combining them achieves even better results. We then propose a novel score propagation method that exploits the co-occurrence of transliteration pairs within document pairs. This propagation method achieves further improvement over the best results from the previous step.
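As a rough illustration of the three ideas named in the abstract, the sketch below scores a candidate pair by how similarly the two names are distributed over time, blends that with a phonetic score, and then runs one round of propagation between pairs that co-occur in the same document pairs. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: cosine similarity as the temporal measure, linear interpolation with weight `alpha` as the combination, the neighbour-averaging update with weight `beta` as the propagation rule, and all function names. The phonetic score is taken as a given input in [0, 1].

```python
import math


def normalize(counts):
    """Turn raw per-day counts into relative frequencies (assumed preprocessing)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def temporal_score(en_counts, zh_counts):
    """Similarity of the two names' distributions over the same time bins."""
    return cosine(normalize(en_counts), normalize(zh_counts))


def combined_score(phonetic, temporal, alpha=0.5):
    """Linear interpolation of the two scores (alpha is a hypothetical weight)."""
    return alpha * phonetic + (1 - alpha) * temporal


def propagate(scores, cooccur, beta=0.8):
    """One synchronous propagation round.

    scores:  {pair: score}
    cooccur: {pair: {neighbour_pair: weight}}, where the weight reflects how
             often the two pairs appear together in the same document pairs.
    Each pair keeps beta of its own score and receives (1 - beta) of the
    weighted average score of its neighbours; pairs without neighbours
    are left unchanged.
    """
    updated = {}
    for pair, own in scores.items():
        neighbours = cooccur.get(pair, {})
        total_weight = sum(neighbours.values())
        if total_weight == 0:
            updated[pair] = own
            continue
        weighted = sum(w * scores.get(n, 0.0) for n, w in neighbours.items())
        updated[pair] = beta * own + (1 - beta) * (weighted / total_weight)
    return updated
```

With these definitions, two names that spike on the same day get a temporal score of 1, names that never appear in the same time bin get 0, and a weakly scored pair is pulled upward when it co-occurs with a confidently scored one. Repeating `propagate` for several rounds, or tuning `alpha` and `beta` on held-out data, would be natural extensions of this sketch.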