Cross Sentence Alignment for Structurally Dissimilar Corpus Based on Singular Value Decomposition

Authors:
Anna Ho;Fai Wong;Francisco Oliveira;Yiping Li
Affiliations:
Faculty of Science and Technology, University of Macau, Macau (all authors)
Venue:
ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Artificial Intelligence
Year:
2008

Abstract

Extracting aligned sentence pairs is a critical step in constructing a bilingual corpus knowledge base for Example-Based Machine Translation systems. Various methods have been proposed for aligning parallel corpora in two languages. However, most of them focus on structurally similar language pairs such as English-French. This paper presents a method for cross-aligning a Portuguese-Chinese bilingual comparable corpus. The proposed approach is based on Singular Value Decomposition and similarity measurement; it addresses the problem of aligning structurally dissimilar corpora and improves the accuracy of the alignment result.
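
The abstract does not say how Singular Value Decomposition and the similarity measure are combined. As an illustration only, and not as the authors' method, the sketch below uses a standard technique that fits the description: cross-language latent semantic indexing. It assumes a small seed set of already aligned Portuguese-Chinese sentence pairs, pre-tokenized input, and cosine similarity as the measure. All function names, the seed data, and the choice of k are hypothetical.

```python
import numpy as np


def build_vocab(pairs):
    """Map every token seen in the seed pairs (both languages) to a row index."""
    vocab = {}
    for src, tgt in pairs:
        for tok in src + tgt:
            vocab.setdefault(tok, len(vocab))
    return vocab


def bag_of_words(tokens, vocab):
    """Term-count vector over the joint vocabulary; unknown tokens are ignored."""
    vec = np.zeros(len(vocab))
    for tok in tokens:
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec


def fit_latent_space(pairs, k):
    """SVD of the term-by-document matrix whose columns are merged seed pairs."""
    vocab = build_vocab(pairs)
    A = np.column_stack([bag_of_words(src + tgt, vocab) for src, tgt in pairs])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]


def fold_in(tokens, vocab, U_k, s_k):
    """Project a monolingual sentence into the k-dimensional latent space."""
    return (U_k.T @ bag_of_words(tokens, vocab)) / s_k


def cross_align(src_sents, tgt_sents, vocab, U_k, s_k):
    """Pair each source sentence with its most similar target sentence.

    Sentence order is ignored, so crossing alignments are allowed.
    """
    S = np.array([fold_in(x, vocab, U_k, s_k) for x in src_sents])
    T = np.array([fold_in(x, vocab, U_k, s_k) for x in tgt_sents])
    # Normalize rows so the dot product is cosine similarity (guard zero rows).
    S /= np.maximum(np.linalg.norm(S, axis=1, keepdims=True), 1e-12)
    T /= np.maximum(np.linalg.norm(T, axis=1, keepdims=True), 1e-12)
    sim = S @ T.T
    alignment = [(i, int(j)) for i, j in enumerate(sim.argmax(axis=1))]
    return alignment, sim


# Hypothetical seed pairs (toy data, already tokenized).
SEED_PAIRS = [
    (["gato", "preto"], ["猫", "黑"]),
    (["casa", "grande"], ["房子", "大"]),
    (["livro", "novo"], ["书", "新"]),
]

# Sentences to align; the Chinese side is deliberately in a different order.
PT_SENTENCES = [["gato", "preto"], ["casa", "grande"], ["livro", "novo"]]
ZH_SENTENCES = [["书", "新"], ["猫", "黑"], ["房子", "大"]]

vocab, U_k, s_k = fit_latent_space(SEED_PAIRS, k=3)
alignment, sim = cross_align(PT_SENTENCES, ZH_SENTENCES, vocab, U_k, s_k)
print(alignment)
```

Because sentences from both languages are projected into one latent space, the comparison does not depend on sentence length or word order, which is why this family of techniques is a natural fit for structurally dissimilar language pairs. A real system would need a much larger seed corpus, proper Chinese word segmentation, and a tuned value of k.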