A joint model to identify and align bilingual named entities

Authors:
Yufeng Chen;Chengqing Zong;Keh-Yih Su
Affiliations:
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences;National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences;Behavior Design Corporation
Venue:
Computational Linguistics
Year:
2013

Abstract

In this article, an integrated model is derived that jointly identifies and aligns bilingual named entities (NEs) between Chinese and English. The model is motivated by the following observations: (1) whether an NE is translated semantically or phonetically depends greatly on its entity type, (2) entities within an aligned pair should share the same type, and (3) the initially detected NEs can act as anchors and provide further information while selecting NE candidates. Based on these observations, this article proposes a translation mode ratio feature (defined as the proportion of NE internal tokens that are semantically translated), enforces an entity type consistency constraint, and utilizes additional new NE likelihoods based on the initially detected NE anchors. Experiments show that this novel method significantly outperforms the baseline. The type-insensitive F-score of identified NE pairs increases from 78.4% to 88.0% (12.2% relative improvement) in our Chinese-English NE alignment task, and the type-sensitive F-score increases from 68.4% to 83.0% (21.3% relative improvement). Furthermore, the proposed model demonstrates its robustness when it is tested across different domains. Finally, when semi-supervised learning is conducted to train the adopted English NE recognition model, the proposed model also significantly boosts the English NE recognition type-sensitive F-score.
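
The translation mode ratio and the relative-improvement figures quoted in the abstract can be illustrated with a minimal sketch. It assumes each internal token of an NE has already been labeled as semantically or phonetically translated; the labels and the function names `translation_mode_ratio` and `relative_improvement` are hypothetical and chosen for illustration only, not taken from the article.

```python
from typing import Sequence

# Hypothetical per-token labels; the article does not prescribe these names.
SEMANTIC = "semantic"
PHONETIC = "phonetic"


def translation_mode_ratio(token_modes: Sequence[str]) -> float:
    """Proportion of NE-internal tokens that are semantically translated."""
    if not token_modes:
        raise ValueError("an NE must contain at least one token")
    semantic_count = sum(1 for mode in token_modes if mode == SEMANTIC)
    return semantic_count / len(token_modes)


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# An illustrative four-token organization name in which three tokens are
# translated by meaning and one is transliterated.
org_name_modes = [PHONETIC, SEMANTIC, SEMANTIC, SEMANTIC]
ratio = translation_mode_ratio(org_name_modes)

# F-scores reported in the abstract (baseline, proposed model).
type_insensitive_gain = relative_improvement(78.4, 88.0)
type_sensitive_gain = relative_improvement(68.4, 83.0)
```

Here `ratio` is 0.75, and the two gains round to 12.2 and 21.3, matching the relative improvements stated in the abstract. Under observation (1), a person name would typically yield a ratio near 0 and an organization name a ratio closer to 1, which is what makes the feature informative about entity type.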