Automatic extraction of named entity translingual equivalence based on multi-feature cost minimization

Authors:
Fei Huang; Stephan Vogel; Alex Waibel
Affiliations:
Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA
Venue:
MultiNER '03 Proceedings of the ACL 2003 workshop on Multilingual and mixed-language named entity recognition - Volume 15
Year:
2003

Abstract

Translingual equivalence refers to the relationship between expressions of the same meaning from different languages. Identifying translingual equivalence of named entities (NE) can significantly contribute to multilingual natural language processing tasks such as crosslingual information retrieval, crosslingual information extraction and statistical machine translation. In this paper we present an integrated approach to extract NE translingual equivalence from a parallel Chinese-English corpus. Starting from a bilingual corpus where NEs are automatically tagged for each language, NE pairs are aligned in order to minimize the overall multi-feature alignment cost. An NE transliteration model is presented and iteratively trained using named entity pairs extracted from a bilingual dictionary. The transliteration cost, combined with the named entity tagging cost and word-based translation cost, constitutes the multi-feature alignment cost. These features are derived from several information sources using unsupervised and partly supervised methods. A greedy search algorithm is applied to minimize the alignment cost. Experiments show that the proposed approach extracts NE translingual equivalence with 81% F-score and improves the translation score from 7.68 to 7.74.
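
The alignment step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`combined_cost`, `greedy_align`), the equal feature weights, the toy cost tables, and the constant tagging cost are all assumptions made for illustration. The abstract states only that the three costs are combined and that a greedy search minimizes the total; the weighted sum used here is one plausible way to combine them.

```python
from typing import Callable, List, Sequence, Tuple

# A feature maps a (source NE, target NE) pair to a non-negative cost.
CostFn = Callable[[str, str], float]


def combined_cost(src: str, tgt: str,
                  features: Sequence[CostFn],
                  weights: Sequence[float]) -> float:
    """Weighted sum of feature costs (the weighting scheme is an assumption)."""
    return sum(w * f(src, tgt) for f, w in zip(features, weights))


def greedy_align(src_nes: Sequence[str], tgt_nes: Sequence[str],
                 cost: CostFn) -> List[Tuple[str, str, float]]:
    """Greedy one-to-one alignment: repeatedly take the cheapest unused pair."""
    candidates = sorted(
        (cost(s, t), i, j)
        for i, s in enumerate(src_nes)
        for j, t in enumerate(tgt_nes)
    )
    used_src, used_tgt = set(), set()
    alignment = []
    for c, i, j in candidates:
        if i in used_src or j in used_tgt:
            continue  # each NE may be aligned at most once
        used_src.add(i)
        used_tgt.add(j)
        alignment.append((src_nes[i], tgt_nes[j], c))
    return alignment


# Hypothetical toy cost tables; a real system would derive these from models.
TRANSLIT = {("北京", "Beijing"): 0.2, ("克林顿", "Clinton"): 0.1}
TRANSLATION = {("北京", "Beijing"): 0.1, ("克林顿", "Clinton"): 0.3}


def translit_cost(s: str, t: str) -> float:
    return TRANSLIT.get((s, t), 1.0)  # unseen pairs get a high default cost


def translation_cost(s: str, t: str) -> float:
    return TRANSLATION.get((s, t), 1.0)


def tagging_cost(s: str, t: str) -> float:
    return 0.0  # placeholder: assumes both NEs were tagged with full confidence


FEATURES = (translit_cost, tagging_cost, translation_cost)
WEIGHTS = (1.0, 1.0, 1.0)  # equal weights are an assumption


def pair_cost(s: str, t: str) -> float:
    return combined_cost(s, t, FEATURES, WEIGHTS)


aligned = greedy_align(["北京", "克林顿"], ["Clinton", "Beijing"], pair_cost)
for src, tgt, c in aligned:
    print(f"{src} -> {tgt} (cost {c:.2f})")
```

A greedy search of this kind does not guarantee the globally minimal total cost; it trades optimality for simplicity, since it only needs one sort over the candidate pairs within a sentence pair.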