Named entity (NE) extraction is one of the fundamental tasks in natural language processing (NLP). Although many studies have focused on identifying NEs within monolingual documents, aligning NEs in bilingual documents has not been investigated extensively due to the complexity of the task. In this article we introduce a new approach to aligning bilingual NEs in parallel corpora by incorporating statistical models with multiple knowledge sources. In our approach, we model the process of translating an English NE phrase into a Chinese equivalent using lexical translation/transliteration probabilities for word translation and alignment probabilities for word reordering. The method involves automatically learning phrase alignment and acquiring word translations from a bilingual phrase dictionary and parallel corpora, and automatically discovering transliteration transformations from a training set of name-transliteration pairs. The method also involves language-specific knowledge functions, including handling abbreviations, recognizing Chinese personal names, and expanding acronyms. At runtime, the proposed models are applied to each source NE in a pair of bilingual sentences to generate and evaluate the target NE candidates; the source and target NEs are then aligned based on the computed probabilities. Experimental results demonstrate that the proposed approach, which integrates statistical models with extra knowledge sources, is highly feasible and offers significant improvement in performance compared to our previous work, as well as the traditional approach of IBM Model 4.
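The runtime step described above (generate target NE candidates for a source NE, score them, and keep the best) can be illustrated with a minimal sketch. It is not the authors' model. It assumes a single hypothetical table `TRANSLATION_PROB` standing in for both translation and transliteration probabilities, two invented reordering constants `STAY` and `MOVE` standing in for alignment probabilities, and candidates restricted to contiguous target spans with the same number of words as the source NE. All numeric values are made up for illustration.

```python
from itertools import permutations

# Hypothetical toy values, invented for illustration. A real system would
# estimate them from a phrase dictionary, parallel corpora, and name pairs.
TRANSLATION_PROB = {
    ("Peking", "北京"): 0.9,
    ("University", "大学"): 0.8,
}
FLOOR = 1e-6            # assumed smoothing value for unseen word pairs
STAY, MOVE = 0.7, 0.3   # assumed per-word reordering probabilities


def score(source, candidate):
    """Best product of lexical and reordering probabilities over all
    one-to-one word alignments between source and candidate."""
    best = 0.0
    for perm in permutations(range(len(candidate))):
        p = 1.0
        for i, j in enumerate(perm):
            # Lexical factor: how likely source word i yields target word j.
            p *= TRANSLATION_PROB.get((source[i], candidate[j]), FLOOR)
            # Reordering factor: keeping the position is assumed more likely.
            p *= STAY if i == j else MOVE
        best = max(best, p)
    return best


def align(source, target_sentence):
    """Return the highest-scoring contiguous target span and its score."""
    n = len(source)
    spans = [target_sentence[k:k + n]
             for k in range(len(target_sentence) - n + 1)]
    return max(((span, score(source, span)) for span in spans),
               key=lambda pair: pair[1])


span, prob = align(["Peking", "University"],
                   ["他", "在", "北京", "大学", "学习"])
print(span, prob)
```

The equal-length restriction keeps the sketch short; it cannot handle abbreviations, acronyms, or source words with no target counterpart, which the abstract says are covered by additional knowledge functions.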