A class-based approach to word alignment

Authors:
Sue J. Ker;Jason S. Chang
Affiliations:
National Tsing Hua University;National Tsing Hua University
Venue:
Computational Linguistics
Year:
1997


Quantified Score

Hi-index	0.01

Abstract

This paper presents an algorithm for identifying the translation of each word in a bilingual corpus. Previously proposed methods rely heavily on word-based statistics. Under a word-based approach, frequent words with a consistent translation can be aligned with high precision. However, words that are less frequent or that have diverse translations generally lack statistically significant evidence for confident alignment, which leads to incomplete or incorrect alignments. The proposed algorithm attempts to broaden coverage by exploiting lexicographic resources. To this end, we draw on two word classification systems: the Longman Lexicon of Contemporary English (LLOCE) and Tongyici Cilin (Synonym Forest, CILIN). Automatically acquired class-based alignment rules compensate for what is lacking in a bilingual dictionary such as the English-Chinese version of the Longman Dictionary of Contemporary English (LecDOCE). The alignment method is trained and tested on LecDOCE examples and their translations, while further examples from a technical manual in both English and Chinese are used for an open test. Quantitative results of the closed and open tests are summarized.
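The general idea of class-based alignment rules can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the class codes, the toy lexicons, the seed pairs, and the function names `learn_class_rules` and `align` are all hypothetical, chosen only to show how class-level evidence might cover a word pair that a bilingual dictionary does not list.

```python
from collections import Counter

# Hypothetical toy class lexicons (stand-ins for LLOCE and CILIN class codes).
EN_CLASS = {"bank": "J", "money": "J", "loan": "J", "river": "L"}
ZH_CLASS = {"银行": "Dj", "钱": "Dj", "贷款": "Dj", "河": "Be"}

# Hypothetical seed alignments, e.g. word pairs confirmed by a bilingual dictionary.
SEED_PAIRS = [("bank", "银行"), ("money", "钱"), ("river", "河")]


def learn_class_rules(seed_pairs, en_class, zh_class):
    """Count how many seed pairs support each (English class, Chinese class) rule."""
    counts = Counter()
    for en, zh in seed_pairs:
        if en in en_class and zh in zh_class:
            counts[(en_class[en], zh_class[zh])] += 1
    return counts


def align(en_words, zh_words, dictionary, rules, en_class, zh_class):
    """For each English word, pick the best unused Chinese word.

    A dictionary match outranks any class-rule score; words with no
    evidence at all are left unaligned.
    """
    links, used = [], set()
    for en in en_words:
        best, best_score = None, 0
        for zh in zh_words:
            if zh in used:
                continue
            if (en, zh) in dictionary:
                score = float("inf")
            else:
                score = rules.get((en_class.get(en), zh_class.get(zh)), 0)
            if score > best_score:
                best, best_score = zh, score
        if best is not None:
            links.append((en, best))
            used.add(best)
    return links


rules = learn_class_rules(SEED_PAIRS, EN_CLASS, ZH_CLASS)
links = align(["loan", "river"], ["河", "贷款"], set(SEED_PAIRS),
              rules, EN_CLASS, ZH_CLASS)
print(links)
```

In this toy setup the pair ("loan", "贷款") is absent from the seed dictionary, yet it is still linked because both words fall into classes that the seed pairs associate with each other. A real system would need a more careful scoring and search strategy than this greedy pass.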